I am trying to understand Prolog and definite clause grammars (DCGs), but I am having a very hard time with both of them.
I am really trying to understand how to use the DCG syntax...
Here I have two examples:
The first is actually the code from another question on this forum but with an additional question:
The code looked like this:
s --> first, operator, second.
first --> [X].
operator --> ['+'].
second --> [X].
When Prolog is asked about this, it returns true/false, but I can't for the life of me figure out how to modify it to "bind" the value: if asked s(X, [2,+,2], [])., it should return the value of first, so instead of answering true it would say X = 2.
Anyway, back to the actual question. I have a few rules in normal Prolog and this is one of them; it doesn't actually do anything and was merely made up as an example.
do(X, Y, [H|T], Sum):-
H == 1, %check if H is 1
X = H,
Y = T,
Additional is H+5,
Sum is Additional+Additional.
Basically, I am asking if someone could translate this to DCG so that I could try and understand the basic syntax of DCG! I've tried reading some tutorials but I feel like I haven't gotten any wiser...
DCG:
foo(A1, A2, A3, ... , An) --> bar.
Prolog:
foo(A1, A2, A3, ... , An, X, Y) :- bar(X,Y).
So, s should be changed to:
s(X) --> first(X), operator, second.
first(X) --> [X].
operator --> ['+'].
second --> [X].
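As a quick check of the rule above, here is a minimal self-contained sketch. The only change is that second//0 uses `_` instead of X, purely to avoid a singleton-variable warning; that tweak is mine, not part of the original code.

```prolog
s(X) --> first(X), operator, second.
first(X) --> [X].
operator --> ['+'].
second --> [_].          % value of the second operand is ignored here

% ?- phrase(s(X), [2,+,2]).
% X = 2.
%
% Calling the translated predicate directly, with the two hidden
% arguments written out, gives the same answer:
% ?- s(X, [2,+,2], []).
% X = 2.
```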
Of course, it might be better to return the actual result; to do this, you embed ordinary Prolog code in the DCG clause by wrapping it in {}:
s(Z) --> first(X), operator, second(Y), {Z is X+Y}.
first(X) --> [X].
operator --> ['+'].
second(X) --> [X].
(Naturally, if you have more operators, the Prolog code won't be that simple.)
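For illustration only, one way to support a few more operators is to let the operator rule return the matched symbol and build the arithmetic term with =../2. The names operand//1 and operator//1 and the set of accepted operators are assumptions made for this sketch.

```prolog
:- use_module(library(lists)).

s(Z) -->
    operand(X), operator(Op), operand(Y),
    { Expr =.. [Op, X, Y],      % e.g. Op = '*', X = 6, Y = 7 gives 6*7
      Z is Expr }.

operand(X) --> [X], { number(X) }.

% Accept only a fixed, assumed set of binary operators.
operator(Op) --> [Op], { memberchk(Op, ['+', '-', '*']) }.

% ?- phrase(s(Z), [6, '*', 7]).
% Z = 42.
% ?- phrase(s(Z), [2, '+', 2]).
% Z = 4.
```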
Regarding the do/4 predicate, it should be something like this:
do(X,Y,[H|T],Sum) -->
{H == 1, %check if H is 1
X = H,
Y = T,
Additional is H+5,
Sum is Additional+Additional}.
but I don't see why you would want that.
One last tip: it's recommended to use phrase/3 instead of adding the last two arguments in a DCG predicate.
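A small sketch of that tip, using a made-up non-terminal pair//2: phrase/2 insists that the whole list is consumed, while phrase/3 hands back whatever is left over.

```prolog
% Hypothetical example grammar: take the first two list elements.
pair(X, Y) --> [X], [Y].

% ?- phrase(pair(A, B), [1, 2]).
% A = 1, B = 2.
%
% ?- phrase(pair(A, B), [1, 2, 3]).
% false.                         % one element is left unconsumed
%
% ?- phrase(pair(A, B), [1, 2, 3], Rest).
% A = 1, B = 2, Rest = [3].
```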
It's not easy to translate do/4 to DCG in a meaningful way. I've removed the arguments that 'copy' the hidden arguments of the DCG.
do(Sum) -->
[1], %check if H is 1
{ % braces allow 'normal' Prolog code (but we have no access to 'hidden' arguments)
Additional is 1+5, % H is not available here, so use the 1 matched above
Sum is Additional+Additional
}.
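To see the translated rule in action, here is a self-contained sketch with the literal 1 in place of H, queried through phrase/3 so that the tail of the list (the old Y argument) comes back as the remainder.

```prolog
do(Sum) -->
    [1],                          % succeeds only if the list starts with 1
    { Additional is 1 + 5,
      Sum is Additional + Additional }.

% ?- phrase(do(Sum), [1, 2, 3], Rest).
% Sum = 12, Rest = [2, 3].
%
% ?- phrase(do(Sum), [2, 3], Rest).
% false.
```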
This is the predicate that does what it should, namely, collect whatever is left on input when part of a DCG:
rest([H|T], [H|T], []).
rest([], [], []).
but I am struggling to define this as a DCG... Or is it at all doable?
This of course is not the same (although it does the same when used in the same manner):
rest([H|T]) --> [H], !, rest(T).
rest([]) --> [].
The reason I think I need this is that the rest//1 is part of a set of DCG rules that I need to parse the input. I could do phrase(foo(T), Input, Rest), but then I would have to call another phrase(bar(T1), Rest).
Say I know that all I have left on input is a string of digits that I want as an integer:
phrase(stuff_n(Stuff, N), `some other stuff, 1324`).
stuff_n(Stuff, N) -->
stuff(Stuff),
rest(Rest),
{ number_codes(N, Rest),
integer(N)
}.
Answering my own silly question:
@CapelliC gave a solution that works (+1). It does something I don't understand :-(, but the real issue was that I did not understand the problem I was trying to solve. The real problem was:
Problem
You have as input a code list that you need to parse. The result should be a term. You know quite close to the beginning of this list of codes what the rest looks like. In other words, it begins with a "keyword" that defines the contents. In some cases, after some point in the input, the rest of the contents do not need to be parsed: instead, they are collected in the resulting term as a code list.
Solution
One possible solution is to break up the parsing into two calls to phrase/3 (because there is no reason not to?):
1. Read the keyword (first call to phrase/3) and make it an atom;
2. Look up in a table what the rest is supposed to look like;
3. Parse only what needs to be parsed (second call to phrase/3).
Code
So, using an approach from (O'Keefe 1990) and taking advantage of library(dcg/basics) available in SWI-Prolog, with a file rest.pl:
:- use_module(library(dcg/basics)).
codes_term(Codes, Term) :-
phrase(dcg_basics:nonblanks(Word), Codes, Codes_rest),
atom_codes(Keyword, Word),
kw(Keyword, Content, Rest, Term),
phrase(items(Content), Codes_rest, Rest).
kw(foo, [space, integer(N), space, integer(M)], [], foo(N, M)).
kw(bar, [], Text, bar(Text)).
kw(baz, [space, integer(N), space], Rest, baz(N, Rest)).
items([I|Is]) -->
item(I),
items(Is).
items([]) --> [].
item(space) --> " ".
item(integer(N)) --> dcg_basics:integer(N).
It is important that here, the "rest" does not need to be handled by a DCG rule at all.
Example use
This solution is nice because it is deterministic, and very easy to expand: just add clauses to the kw/4 table and item//1 rules. (Note the use of the --traditional flag when starting SWI-Prolog, for double-quote delimited code lists)
$ swipl --traditional --quiet
?- [rest].
true.
?- codes_term("foo 22 7", T).
T = foo(22, 7).
?- codes_term("bar 22 7", T).
T = bar([32, 50, 50, 32, 55]).
?- codes_term("baz 22 7", T).
T = baz(22, [55]).
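To illustrate the "just add clauses" claim, here is a trimmed, self-contained sketch of the same program with one extra table entry. The keyword pt and its three-integer layout are invented for this example. The double_quotes directive is only there so that " " is a code list without the --traditional flag, and the query goes through atom_codes/2 to stay independent of quoting flags.

```prolog
:- use_module(library(dcg/basics)).
:- set_prolog_flag(double_quotes, codes).

codes_term(Codes, Term) :-
    phrase(dcg_basics:nonblanks(Word), Codes, Codes_rest),
    atom_codes(Keyword, Word),
    kw(Keyword, Content, Rest, Term),
    phrase(items(Content), Codes_rest, Rest).

kw(foo, [space, integer(N), space, integer(M)], [], foo(N, M)).
% Hypothetical new keyword: "pt X Y Z" with three integers.
kw(pt, [space, integer(X), space, integer(Y), space, integer(Z)], [],
   pt(X, Y, Z)).

items([I|Is]) --> item(I), items(Is).
items([]) --> [].

item(space) --> " ".
item(integer(N)) --> dcg_basics:integer(N).

% ?- atom_codes('pt 1 2 3', Cs), codes_term(Cs, T).
% T = pt(1, 2, 3).
```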
An alternative (that doesn't leave a choice point behind) is to use the call//1 built-in non-terminal with a lambda expression. Using Logtalk's lambda expression syntax to illustrate:
rest(Rest) --> call({Rest}/[Rest,_]>>true).
This solution is a bit nasty, however, as it uses a variable with a dual role in the lambda expression (which triggers a warning with the Logtalk compiler). A usage example:
:- object(rest).
:- public(test/2).
test(Input, Rest) :-
phrase(input(Rest), Input).
input(Rest) --> [a,b,c], rest(Rest).
rest(Rest) --> call({Rest}/[Rest,_]>>true).
% rest([C|Cs]) --> [C|Cs]. % Carlo's solution
:- end_object.
Assuming the above object is saved in a dcg_rest.lgt source file:
$ swilgt
...
?- {dcg_rest}.
* Variable A have dual role in lambda expression: {A}/[A,B]>>true
* in file /Users/pmoura/Desktop/dcg_rest.lgt between lines 13-14
* while compiling object rest
% [ /Users/pmoura/Desktop/dcg_rest.lgt loaded ]
% 1 compilation warning
true.
?- rest::test([a,b,c,d,e], Rest).
Rest = [d, e].
You should be able to get the same results using other lambda expression implementations, such as Ulrich's lambda library.
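If no lambda library is at hand, the same idea can be sketched with call//N and a plain helper predicate. The helper name rest_/3 is made up for this example; it takes the two hidden list arguments explicitly, returns the whole remaining input, and leaves nothing behind.

```prolog
input(Rest) --> [a, b, c], rest(Rest).

% call//2 appends the two hidden DCG arguments to the goal,
% so this ends up calling rest_(Rest, S0, S).
rest(Rest) --> call(rest_, Rest).

% Hypothetical helper: everything still on the input is the rest,
% and nothing remains afterwards.
rest_(Rest, Rest, []).

% ?- phrase(input(Rest), [a, b, c, d, e]).
% Rest = [d, e].
```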
It could be
rest([C|Cs]) --> [C|Cs].
At least in SWI-Prolog, it seems to run (I used library(dcg/basics) to get the number):
line(I,R) --> integer(I), rest(R).
?- phrase(line(N,R), `6546 okok`).
N = 6546,
R = [32, 111, 107, 111, 107]
I have a simple grammar, which takes 3 list items and runs a different dcg rule on each.
[debug] ?- phrase(sentence(X), [sky, a, 1], []).
X = [bright, amber, on] .
Code:
sentence([A,C,R]) -->
analyse(A),
colour(C),
rating(R).
analyse(bright) --> [sky].
analyse(dark) --> [cave].
colour(red) --> [r].
colour(amber) --> [a].
colour(green) --> [g].
rating(on) --> [1].
rating(off) --> [0].
This works fine.
My problem is that my input list needs to have 2 items, not 3, and the second atom is a concatenation of colour and rating:
[sky, a1]
So somehow I have to (?) split this atom into [a, 1] and then the colour and rating rules will work with a simple dcg rule.
I can't work out how to do this. Obviously, with normal Prolog I'd just use atom_chars, but I can't work out how to interleave this with the grammar.
In a perfect world, it feels like I should not have to resort to using atom_chars, and I should be able to come up with a simple dcg rule to split it, but I'm not sure if this is possible, since we are parsing lists, not atoms.
As you have said yourself, you just need to use a predicate like atom_chars/2. You can interleave normal code into a DCG rule by enclosing it in { and }.
But there is something fishy about your problem definition. As you have also said yourself, you are parsing a list, not an atom. The list you are parsing should be already properly tokenized, otherwise you cannot expect to define a DCG that can parse it. Or am I seeing this wrong?
So in other words: you take your input, split into single chars, tokenize that using a DCG. Depending on your input, you can do the parsing in the same step.
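A minimal sketch of that suggestion, assuming the second token is always a colour letter followed by a rating digit: take the token, split it with atom_chars/2 inside braces, and parse the characters with a nested phrase/2. The non-terminal colour_rating//2 is a name invented here. Note that atom_chars/2 yields the character '1', not the integer 1, so rating//1 is adjusted accordingly.

```prolog
sentence([A, C, R]) -->
    analyse(A),
    [Token],
    { atom_chars(Token, Chars),              % a1 becomes [a, '1']
      phrase(colour_rating(C, R), Chars) }.

colour_rating(C, R) --> colour(C), rating(R).

analyse(bright) --> [sky].
analyse(dark)   --> [cave].

colour(red)   --> [r].
colour(amber) --> [a].
colour(green) --> [g].

rating(on)  --> ['1'].    % characters, since they come from atom_chars/2
rating(off) --> ['0'].

% ?- phrase(sentence(X), [sky, a1]).
% X = [bright, amber, on].
```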
It was clear that a refined DCG rule could work but, alas, it took me quite some time to craft a solution for your problem.
Here it is:
sentence([A,C,R]) -->
analyse(A),
colour(C),
rating(R).
analyse(bright) --> [sky].
analyse(dark) --> [cave].
colour(red) --> [r].
colour(amber) --> [a].
colour(green) --> [g].
colour(X), As --> [A], {
atom_codes(A, Cs),
maplist(char2atomic, Cs, L),
phrase(colour(X), L, As)}.
rating(on) --> [1].
rating(off) --> [0].
char2atomic(C, A) :- code_type(C, digit) -> number_codes(A, [C]) ; atom_codes(A, [C]).
yields
?- phrase(sentence(X), [sky, a1], []).
X = [bright, amber, on]
The key is the use of 'pushback' (i.e. colour(X), As --> ...).
Here we split the unparsable input, consume a token, and push back the rest...
As usual, most of the time went into understanding where my first attempt failed: I was coding char2atomic(C, A) :- atom_codes(A, [C])., but then rating//1 failed, because the digit came back as the atom '1' rather than the integer 1...
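For readers new to pushback, here is the mechanism in isolation, as a sketch with an invented non-terminal peek//1. The list written after the comma in the rule head is put back in front of the remaining input once the body has succeeded.

```prolog
% Consume one token, then push it straight back:
% the net effect is a look-ahead that leaves the input untouched.
peek(T), [T] --> [T].

% ?- phrase(peek(X), [a, b, c], Rest).
% X = a, Rest = [a, b, c].
```

The colour//1 clause above works on the same principle, except that what it pushes back is the unparsed remainder of the split token rather than the token itself.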
I am trying to define a palindrome where the number of a's is one less than the number of b's.
I can't seem to figure out how to write it properly:
please-->palindromes.
palindromes-->[].
palindromes-->[a].
palindromes-->[b].
palindromes--> [b],palindromes,[b].
Think about this: where could the surplus 'b' go? In a palindrome, there is only one such place. Then change the symmetric definition, which in BNF (you already know how to translate it to DCG) would read
S :: P
P :: a P a | b P b | {epsilon}
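For reference, the symmetric BNF above translates to DCG roughly as follows. This is only the unmodified starting point (even-length palindromes over a and b); it does not yet handle the surplus 'b'.

```prolog
s --> p.

p --> [].               % epsilon
p --> [a], p, [a].
p --> [b], p, [b].

% ?- phrase(s, [a, b, b, a]).
% true.
% ?- phrase(s, [a, b]).
% false.
```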
You're on the right track, you just need a way to deal with the difference in counts. You can do this by adding a numeric argument to your palindromes grammar term.
First I'll define an ordinary Prolog rule implementing "B is two more than A":
plus2(A,B) :- number(A), !, B is A+2.
plus2(A,B) :- number(B), !, A is B-2.
plus2(A,B) :- var(A), var(B), throw(error(instantiation_error,plus2/2)).
Then we'll say palindromes(Diff) means any palindrome on the given alphabet where the number of b letters minus the number of a letters is Diff. For the base cases, you know Diff exactly:
palindromes(0) --> [].
palindromes(-1) --> [a].
palindromes(1) --> [b].
For the recursive grammar rules, we can use a code block in {braces} to check the plus2 predicate:
palindromes(DiffOuter) --> [b], palindromes(DiffInner), [b],
{ plus2(DiffInner, DiffOuter) }.
palindromes(DiffOuter) --> [a], palindromes(DiffInner), [a],
{ plus2(DiffOuter, DiffInner) }.
To finish off, the top-level grammar rule is simply
please --> palindromes(1).
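Putting the pieces together in one file, a few queries show how Diff behaves. This is a sketch that assembles the clauses given above; the queries assume a fully instantiated input list.

```prolog
plus2(A, B) :- number(A), !, B is A + 2.
plus2(A, B) :- number(B), !, A is B - 2.
plus2(A, B) :- var(A), var(B), throw(error(instantiation_error, plus2/2)).

palindromes(0)  --> [].
palindromes(-1) --> [a].
palindromes(1)  --> [b].
palindromes(DiffOuter) --> [b], palindromes(DiffInner), [b],
    { plus2(DiffInner, DiffOuter) }.
palindromes(DiffOuter) --> [a], palindromes(DiffInner), [a],
    { plus2(DiffOuter, DiffInner) }.

please --> palindromes(1).

% ?- phrase(please, [b, a, b]).            % two b's, one a
% true.
% ?- phrase(please, [a, b, a]).            % one b, two a's
% false.
% ?- phrase(palindromes(D), [b, b, a, b, b]).
% D = 3.
```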
I need some help here with Prolog.
So I have this function between that evaluates if an element is between other two.
What I need now is a function that evaluates if a member is not between other two, even if it is the same as one of them.
I tried it :
notBetween(X,Y,Z,List):-right(X,Y,List),right(Z,Y,List). % right means Z is to the right of Y, and likewise left for the left
notBetween(X,Y,Z,List):-left(X,Y,List),left(Z,Y,List).
notBetween(X,Y,Z,List):-Y is Z;Y is X.
I am just starting with Prolog, so maybe it is not even close to working, and I would appreciate some help!
When it comes to negation, Prolog's behaviour must be handled more carefully, because negation is 'embedded' in the proof engine (see SLD resolution to learn a little more about abstract Prolog). In your case, you are listing 3 alternatives, so if one is not true, Prolog will try the next. That's the opposite of what you need.
There is an operator (\+)/1, read 'not'. The name was deliberately chosen to differ from not, to remind us that it's a bit different from the 'not' we use so easily in everyday speech.
But in this case it will do the trick:
notBetween(X,Y,Z,List) :- \+ between(X,Y,Z,List).
Of course, to a Prolog programmer, the direct use of \+ will be clearer than a predicate that 'hides' it and requires inspection.
A possible definition of between/4 with basic list builtins:
between(X,Y,Z,List) :- append(_, [X,Y,Z|_], List) ; append(_, [Z,Y,X|_], List).
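Here are the two definitions side by side as a runnable sketch. The queries use ground arguments on purpose: \+ only tests provability and never binds variables, so calling notBetween/4 with unbound X, Y or Z will not enumerate answers.

```prolog
:- use_module(library(lists)).

% Y sits directly between X and Z, in either direction.
between(X, Y, Z, List) :-
    (   append(_, [X, Y, Z|_], List)
    ;   append(_, [Z, Y, X|_], List)
    ).

notBetween(X, Y, Z, List) :-
    \+ between(X, Y, Z, List).

% ?- between(a, b, c, [a, b, c]).
% true.
% ?- notBetween(a, b, c, [a, b, c]).
% false.
% ?- notBetween(a, c, b, [a, b, c]).
% true.
```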
EDIT: a simpler, constructive definition (minimal?) could be:
notBetween(X,Y,Z, List) :-
nth1(A, List, X),
nth1(B, List, Y),
nth1(C, List, Z),
( B < A, B < C ; B > A, B > C ), !.
EDIT: (==)/2 works with lists, without side effects (it doesn't instantiate variables). Example:
1 ?- [1,2,3] == [1,2,3].
true.
2 ?- [1,2,X] == [1,2,X].
true.
3 ?- [1,2,Y] == [1,2,X].
false.
I have a manually made DCG rule to select idiomatic phrases
over single words. The DCG rule reads as follows:
seq(cons(X,Y), I, O) :- noun(X, I, H), seq(Y, H, O), \+ noun(_, I, O).
seq(X) --> noun(X).
The first clause is manually made, since (:-)/2 is used instead
of (-->)/2. Can I replace this manually made clause by
some clause that uses standard DCG?
Best Regards
P.S.: Here is some test data:
noun(n1) --> ['trojan'].
noun(n2) --> ['horse'].
noun(n3) --> ['trojan', 'horse'].
noun(n4) --> ['war'].
And here are some test cases, the important test case is the first test case, since it does only
deliver n3 and not cons(n1,n2). The behaviour of the first test case is what is especially desired:
?- phrase(seq(X),['trojan','horse']).
X = n3 ;
No
?- phrase(seq(X),['war','horse']).
X = cons(n4,n2) ;
No
?- phrase(seq(X),['trojan','war']).
X = cons(n1,n4) ;
No
(To avoid collisions with other non-terminals I renamed your seq//1 to nounseq//1)
Can I replace this manually made clause by some clause that uses standard DCG?
No, because it is not steadfast and it is STO (details below).
Intended meaning
But let me start with the intended meaning of your program. You say you want to select idiomatic phrases over single words. Is your program really doing this? Or, to put it differently, is your definition really unique? I could now construct a counterexample, but let Prolog do the thinking:
nouns --> [] | noun(_), nouns.
?- length(Ph, N), phrase(nouns,Ph),
dif(X,Y), phrase(nounseq(X),Ph), phrase(nounseq(Y),Ph).
Ph = [trojan,horse,trojan], N = 3, X = cons(n1,cons(n2,n1)), Y = cons(n3,n1)
; ...
; Ph = [trojan,horse,war], N = 3, X = cons(n3,n4), Y = cons(n1,cons(n2,n4))
; ... .
So your definition is ambiguous. What you essentially want (probably) is some kind of rewrite system. But those are rarely defined in a determinate manner. What if two words overlap, as with an additional noun(n5) --> [horse, war]. etc.?
Conformance
A disclaimer up-front: Currently, the DCG document is still being developed — and comments are very welcome! You find all material in this place. So strictly speaking, there is at the current point in time no notion of conformance for DCG.
Steadfastness
One central property a conforming definition must maintain is the property of steadfastness. So before looking into your definition, I will compare two goals of phrase/3 (running SWI in default mode).
?- Ph = [], phrase(nounseq(cons(n4,n4)),Ph0,Ph).
Ph = [], Ph0 = [war,war]
; false.
?- phrase(nounseq(cons(n4,n4)),Ph0,Ph), Ph = [].
false.
?- phrase(nounseq(cons(n4,n4)),Ph0,Ph).
false.
Moving the goal Ph = [] to the end removes the only solution. Therefore, your definition is not steadfast. This is due to the way you handle (\+)/1: the variable O must not occur within the (\+)/1. On the other hand, if it does not occur within (\+)/1, you can only inspect the beginning of a sentence, not the entire sentence.
Subject to occurs-check property
But the situation is worse:
?- set_prolog_flag(occurs_check,error).
true.
?- phrase(nounseq(cons(n4,n4)),Ph0,Ph).
ERROR: noun/3: Cannot unify _G968 with [war|_G968]: would create an infinite tree
So your program relies on STO-unifications (subject-to-occurs-check unifications) whose outcome is explicitly undefined in
ISO/IEC 13211-1 Subclause 7.3.3 Subject to occurs-check (STO) and not subject to occurs-check (NSTO)
This is rather due to your intention to define the intersection of two non-terminals. Consider the following way to express it:
:- op( 950, xfx, //\\). % ASCII approximation for ∩ - 2229;INTERSECTION
(NT1 //\\ NT2) -->
call(Xs0^Xs^(phrase(NT1,Xs0,Xs),phrase(NT2,Xs0,Xs))).
% The following is predefined in library(lambda):
^(V0, Goal, V0, V) :-
call(Goal,V).
^(V, Goal, V) :-
call(Goal).
Already with this definition we can get into STO situations:
?- phrase(([a]//\\[a,b]), Ph0,Ph).
ERROR: =/2: Cannot unify _G3449 with [b|_G3449]: would create an infinite tree
In fact, when using rational trees we get:
?- set_prolog_flag(occurs_check,false).
true.
?- phrase(([a]//\\[a,b]), Ph0,Ph).
Ph0 = [a|_S1], % where
_S1 = [b|_S1],
Ph = [b|_S1].
So there is an infinite list which certainly has not much meaning for natural language sentences (except for persons of infinite resource and capacity...).