How to build a Prolog grammar parse tree for two sentences joined by a conjunction

I have the following Prolog code to recognise a sentence. Notice that it also builds a parse tree for the grammar.
sentence(plural,s(Np,Vp)) -->
noun_phrase(plural,Np),
verb_phrase(plural,Vp).
sentence(singular,s(Np,Vp)) -->
noun_phrase(singular,Np),
verb_phrase(singular,Vp).
I need a predicate that can recognise a compound sentence (two sentences joined by a conjunction). I came up with the following code, but execution fails. Of course, my Prolog code also contains definitions for noun_phrase, verb_phrase and so on.
compound_sentence(comp_s(s1,Conj,s2)) -->
sentence(_,s1(Np,Vp)),
conjuction(_,Conj),
sentence(_,s2(Np,Vp)).
For example, this query fails:
?- phrase(compound_sentence(_),
[the,reboot,is,a,success,and,the,user,does,a,save]).
How do you go about detecting compound sentences?

The query
phrase(compound_sentence(_), ...)
fails for two reasons. (a) The two subgoals sentence(_, s1(Np,Vp)) and sentence(_, s2(Np,Vp)) cannot match the parse tree that sentence//2 builds, which has the form s(Np,Vp); in addition, s1 and s2 in the head comp_s(s1,Conj,s2) are atoms, not variables. (b) The two sentences cannot share the same Np and Vp. Try something like this:
compound_sentence(comp_s(S1,Conj,S2)) -->
sentence(_, S1),
conjuction(_,Conj),
sentence(_, S2).
where S1 = s(Np1, Vp1) corresponding to the first sentence, and S2 = s(Np2, Vp2) for the second.
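To make the corrected rule concrete, here is a minimal self-contained sketch. The lexicon and the shapes of noun_phrase//2, verb_phrase//2 and conjuction//2 are assumptions made up for illustration (the question does not show them); only compound_sentence//1 and sentence//2 follow the code above.

```prolog
% sentence//2 as in the question, with the number shared as a variable.
sentence(Num, s(Np, Vp)) -->
    noun_phrase(Num, Np),
    verb_phrase(Num, Vp).

% Corrected rule: S1 and S2 are variables bound to whole s(...) trees.
compound_sentence(comp_s(S1, Conj, S2)) -->
    sentence(_, S1),
    conjuction(_, Conj),
    sentence(_, S2).

% --- Hypothetical helper rules and lexicon (not from the question) ---
noun_phrase(Num, np(D, N)) --> det(Num, D), noun(Num, N).
verb_phrase(Num, vp(V, Np)) --> verb(Num, V), noun_phrase(_, Np).

det(_, det(the))      --> [the].
det(singular, det(a)) --> [a].

noun(singular, n(reboot))  --> [reboot].
noun(singular, n(success)) --> [success].
noun(singular, n(user))    --> [user].
noun(singular, n(save))    --> [save].

verb(singular, v(is))   --> [is].
verb(singular, v(does)) --> [does].

% Spelled as in the question; the first argument is unused here.
conjuction(_, conj(and)) --> [and].

% ?- phrase(compound_sentence(T),
%           [the,reboot,is,a,success,and,the,user,does,a,save]).
% T = comp_s(s(np(det(the),n(reboot)), vp(v(is),np(det(a),n(success)))),
%            conj(and),
%            s(np(det(the),n(user)), vp(v(does),np(det(a),n(save))))).
```

Because each sentence gets its own tree variable, the two halves are free to have different noun and verb phrases.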

Related

Expanding DCGs in Prolog

I'm writing a code generator that converts definite clause grammars to other grammar notations. To do this, I need to expand a grammar rule:
:- initialization(main).
main :-
-->(example,A),writeln(A).
% this should print ([a],example1), but this is a runtime error
example --> [a],example1.
example1 --> [b].
But -->(example, A) doesn't expand the rule, even though -->/2 appears to be defined here. Is there another way to access the definitions of DCG grammar rules?
This is a guess at what you are expecting and why you are having a problem. It just bugs me because I know you are smart and should be able to connect the dots from the comments. (Comments were deleted when this was posted, but the OP did see them.)
This is very specific to SWI-Prolog.
When Prolog code is loaded it automatically goes through term expansion as noted in expand.pl.
Any clause with --> will get expanded based on the rules of dcg_translate_rule/2. So when you use listing/1 on the code after it is loaded, the clauses with --> have already been expanded. So AFAIK you cannot see ([a],example1), which is the code before loading and term expansion, but only example([a|A], B) :- example1(A, B), which is the code after loading and term expansion.
The only way to get the code as you want would be to turn off the term expansion during loading, but then the code that should have been expanded will not be, and the code will not run.
You could also try and find the source for the loaded code but I also think that is not what you want to do.
Based on your statement "I'm writing a code generator that converts definite clause grammars to other grammar notations", perhaps you need to replace the code for dcg_translate_rule/2 or somehow intercept the code on loading, before the term expansion.
HTH
As for the error from the goal -->(example,A), writeln(A): it occurs because that goal is not a valid DCG clause.
As you wrote in the comments, if you want to convert DCGs into CHRs, you need to apply the conversion before the default expansion of DCGs into clauses. For example, assuming your code is saved to a grammars.pl file:
?- assertz(term_expansion((H --> B), '--->'(H,B))).
true.
?- assertz(goal_expansion((H --> B), '--->'(H,B))).
true.
?- [grammars].
[a],example1
true.
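The same idea can also be written inside the grammar file itself. The following is only a sketch for SWI-Prolog: dcg_rule/2 is a hypothetical name chosen here, and the hook intercepts every --> clause read after it, so those rules are stored as plain facts instead of being translated into runnable clauses.

```prolog
:- multifile user:term_expansion/2.
:- dynamic user:term_expansion/2.

% Assumption: dcg_rule/2 is a made-up fact name for this sketch.
% Each grammar rule Head --> Body is kept untranslated as dcg_rule(Head, Body).
user:term_expansion((Head --> Body), dcg_rule(Head, Body)).

example --> [a], example1.
example1 --> [b].

% ?- dcg_rule(example, Body).
% Body = ([a], example1).
```

A code generator can then walk the dcg_rule/2 facts and emit whatever target notation it needs. Since the hook is global, it should be retracted once the grammar has been loaded if other DCG code is to be loaded normally afterwards.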

Ignore rest of input

What's the preferred way to ignore rest of input? I found one somewhat verbose way:
ignore_rest --> [].
ignore_rest --> [_|_].
And it works:
?- phrase(ignore_rest, "foo schmoo").
true ;
But when I try to collapse these two rules into:
ignore_rest2 --> _.
Then it doesn't:
?- phrase(ignore_rest2, "foo schmoo").
ERROR: phrase/3: Arguments are not sufficiently instantiated
What you want is to state that there is a sequence of arbitrarily many characters. The easiest way to describe this is:
... -->
[].
... -->
[_],
... .
Using [_|_] as a non-terminal, as you did, is an SWI-Prolog specific extension which is highly problematic. In fact, in the past, there were several different extensions to and interpretations of [_|_]. Most notably, Quintus Prolog permitted a user-defined '.'/4 to be called when [_|_] was used as a non-terminal. Note that [_|[]] was still considered a terminal! Actually, this was rather an implementation error. Nevertheless, it was exploited. For such an example, see:
David B. Searls, Investigating the Linguistics of DNA with Definite Clause Grammars. NACLP 1989.
Why not simply use phrase/3 instead of phrase/2? For example, assuming that you have a prefix//0 non-terminal that consumes only part of the input:
?- phrase(prefix, Input, _).
The third argument of phrase/3 returns the non-consumed terminals, which you can simply ignore.
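A small sketch of the phrase/3 approach follows. The non-terminal foo_prefix//0 is a hypothetical stand-in for prefix//0, and the input is written as a list of one-character atoms so that the example does not depend on the double_quotes flag.

```prolog
% Hypothetical non-terminal that consumes only the leading f, o, o.
foo_prefix --> [f, o, o].

% Succeeds whenever Input starts with the prefix; the rest is ignored.
starts_with_prefix(Input) :-
    phrase(foo_prefix, Input, _Rest).

% ?- phrase(foo_prefix, [f,o,o,' ',b,a,r], Rest).
% Rest = [' ', b, a, r].
```

Unlike the ignore_rest//0 rules, this leaves no choice point behind, because nothing has to enumerate the remaining elements.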

Prolog type errors with DCG library functions

I'm trying to write a DCG for a command interface. The idea is to read a string of input, split it on spaces, and hand the resulting list of tokens to a DCG to parse it into a command and arguments. The result of parsing should be a list of terms which I can use with =.. to construct a goal to call. However, I've become really confused by the string type situation in SWI-Prolog (ver. 7.2.3). SWI-Prolog includes a library of basic DCG functionality, including a goal integer//1 which is supposed to parse an integer. It fails due to a type error, but the bigger problem is that I can't figure out how to make a DCG work nicely in SWI-Prolog with "lists of tokens".
Here's what I'm trying to do:
:- use_module(library(dcg/basics)).
% integer//1 is from the dcg/basics lib
amount(X) --> integer(X), { X > 0 }.
cmd([show,all]) --> ["show"],["all"].
cmd([show,false]) --> ["show"].
cmd([skip,X]) --> ["skip"], amount(X).
% now in the interpreter:
?- phrase(cmd(L), ["show","all"]).
L = [show, all].
% what is the problem with this next query?
?- phrase(cmd(L), ["skip", "50"]).
ERROR: code_type/2: Type error: `character' expected, found `"50"' (a string)
I have read Section 5.2 of the SWI manual, but it didn't quite answer my questions:
What type is expected by integer//1 in the dcg/basics library? The error message says "character", but I can't find any useful reference as to what exactly this means and how to provide it with "proper" input.
How do I pass a list of strings (tokens) to phrase/2 such that I can use integer//1 to parse a token as an integer?
If there's no way to use the integer//1 primitive to parse a string of digits into an integer, how should I accomplish this?
I did quite a bit of experimenting with different values for the double_quotes flag in SWI-Prolog, plus different input formats, such as using a list of atoms, using a single string as the input, i.e. "skip 50" rather than ["skip", "50"], and so on, but I feel like there are assumptions about how DCGs work that I don't understand.
I have studied these three pages as well, which have lots of examples but none quite address my issues (some links omitted since I don't have enough reputation to post all of them):
The tutorial "Using Definite Clause Grammars in SWI-Prolog" by Anne Ogborn
A tutorial from Amzi! Prolog about writing command interfaces as DCGs.
Section 7.3 of J. R. Fisher's Prolog tutorial
A third, more broad question is how to generate an error message if an integer is expected but cannot be parsed as one, something like this:
% the user types:
> skip 50x
I didn't understand that number.
One approach is to set the variable X in the DCG above to some kind of error value and then check for that later (like in the hypothetical skip/1 goal that is supposed to get called by the command), but perhaps there's a better/more idiomatic way? Most of my experience in writing parsers comes from using Haskell's Parsec and Attoparsec libraries, which are fairly declarative but work somewhat differently, especially as regards error handling.
Standard Prolog doesn't have a dedicated string type. The traditional representation of a double-quoted character sequence is a list of codes (integers). For efficiency reasons, SWI-Prolog ver. >= 7 introduced strings as a new atomic data type:
?- atomic("a string").
true.
and backquoted literals now play the role previously held by double-quoted text:
?- X=`123`.
X = [49, 50, 51].
Needless to say, this caused some confusion, also given the weakly typed nature of Prolog...
Anyway, a DCG still works on (difference) lists of character codes, just the translator has been extended to accept strings as terminals. Your code could be
cmd([show,all]) --> whites,"show",whites,"all",blanks_to_nl.
cmd([show,false]) --> whites,"show",blanks_to_nl.
cmd([skip,X]) --> whites,"skip",whites,amount(X),blanks_to_nl.
and can be called like
?- phrase(cmd(C), ` skip 2300 `).
C = [skip, 2300].
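If the input really has to be a list of string tokens, as in the question, one possible sketch is to keep the token-level grammar and run integer//1 on the codes of a single token. This assumes SWI-Prolog 7 or later with the default double_quotes flag (so "skip" is a string object); token_integer/2 is a hypothetical helper name, not part of any library.

```prolog
:- use_module(library(dcg/basics)).

% Hypothetical helper: parse one string token as an integer by applying
% integer//1 from dcg/basics to the token's character codes.
token_integer(Token, N) :-
    string_codes(Token, Codes),
    phrase(integer(N), Codes).

% amount//1 now consumes one token instead of a run of codes.
amount(X) --> [Token], { token_integer(Token, X), X > 0 }.

cmd([show,all])   --> ["show"], ["all"].
cmd([show,false]) --> ["show"].
cmd([skip,X])     --> ["skip"], amount(X).

% ?- phrase(cmd(L), ["skip", "50"]).
% L = [skip, 50].
```

The type error in the question arises because integer//1 inspects each list element with code_type/2, so it must be given character codes rather than a whole string.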
Edit, regarding "how to generate an error message if an integer is expected":
I would try:
...
cmd([skip,X]) --> whites,"skip",whites,amount(X).
% integer//1 is from the dcg/basics lib
amount(X) --> integer(X), { X > 0 }, blanks_to_nl, !.
amount(unknown) --> string(S), eos, {print_message(error, invalid_int_arg(S))}.
% declare the hook as multifile before adding a clause to it
:- multifile prolog:message//1.
prolog:message(invalid_int_arg(_)) --> ['I didn\'t understand that number.'].
test:
?- phrase(cmd(C), ` skip 2300x `).
ERROR: I didn't understand that number.
C = [skip, unknown] ;
false.

How do I use this Prolog predicate so as to receive the result? Cannot figure out input

Our textbook gave us this example of a structurer for a math equation in Prolog:
math(Result) --> number(Number1), operator(Operator), number(Number2), { Result = [Number1, Operator, Number2] }.
operator('+') --> ['+'].
number('number') --> ['NUMBER'].
I'm quite new to Prolog, however, and I have no idea how to use this example to get the output. I'm under the impression it restructures the input using Result and outputs it for use.
The only input I've tried that doesn't cause an error is math('number', '+', 'number'). but it always outputs false and I don't know why. Furthermore shouldn't it restructure it and give me the result in Result as well?
What should I be inputting here?
This example is a DCG. You should use the phrase/2 interface predicate to access DCGs.
To find out what the DCG describes, start with the most general query, relating the nonterminal math(R) to a list Ls that is described by the first argument:
?- phrase(math(R), Ls).
From the answer you get (very easy exercise!), you will notice that R is probably not what you meant it to be. Hint: Look up (=..)/2.
Notice in particular that you need not be "inputting" anything here: A DCG describes a list. The list can be specified, but need not be given: A variable will do too! Think in terms of relations between arbitrary terms.
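For reference, here is the textbook grammar again with the suggested queries written out as comments. Nothing is added to the grammar itself; only the queries and their answers are new.

```prolog
math(Result) -->
    number(Number1), operator(Operator), number(Number2),
    { Result = [Number1, Operator, Number2] }.

operator('+') --> ['+'].
number('number') --> ['NUMBER'].

% Most general query: both the result and the described list are variables.
% ?- phrase(math(R), Ls).
% R = [number, +, number],
% Ls = ['NUMBER', +, 'NUMBER'].

% The list can also be given explicitly:
% ?- phrase(math(R), ['NUMBER', '+', 'NUMBER']).
% R = [number, +, number].
```

The call math('number', '+', 'number') fails because the translated predicate math/3 expects the result as its first argument and the list and its remainder as the last two, which is why phrase/2 is the intended interface.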

Error in PROLOG code

I am new to prolog.
I want my Prolog code to produce the expected output given below. Can someone please tell me where I am going wrong?
The code is meant to remove duplicates and produce output in the required format.
remove_dups([],_L2,_L2).
remove_dups([A|B],L2,L3) :-
functor(A,Pr,Ar),(member(level(Pr,Ar,1) ,L2) -> remove_dups(B,L2,L2); append([level(Pr,Ar,1)],L2,L3),remove_dups(B,L3,L3)).
expected output:
?- remove_dups([a,b,a],[],L).
L = [level(a,0,1),level(b,0,1)].
For starters I would have preferred to separate the two steps: removal of duplicates and presentation of the levels.
remove_dups([],[]).
remove_dups([X|Xs],Ys) :- member(X,Xs), !, remove_dups(Xs,Ys).
remove_dups([X|Xs],[X|Ys]) :- remove_dups(Xs,Ys).
levels([],[]).
levels([X|Xs],[level(N,A,1)|Ys]):- functor(X,N,A), levels(Xs,Ys).
go(L,R):- remove_dups(L,RL), levels(RL,R).
I have to admit that the constant 1 in the level triples puzzles me. Are you sure that it should not be somehow more meaningful?
I have also assumed that the order of the list elements is of no importance: remove_dups removes all occurrences of a duplicated element except for the last one. If you would like to keep the first occurrence, remove_dups has to be modified.
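One possible modification that keeps the first occurrence is sketched below, assuming the input list is ground. The names remove_dups_first/2 and go_first/2 are made up here to avoid clashing with the definitions above; levels/2 is repeated so the block is self-contained.

```prolog
% Keep the first occurrence of each element by remembering what was seen.
remove_dups_first(Xs, Ys) :-
    remove_dups_first_(Xs, [], Ys).

remove_dups_first_([], _Seen, []).
remove_dups_first_([X|Xs], Seen, Ys) :-
    (   memberchk(X, Seen)
    ->  Ys = Ys0                 % already emitted earlier: skip it
    ;   Ys = [X|Ys0]             % first time we see X: keep it
    ),
    remove_dups_first_(Xs, [X|Seen], Ys0).

% Same as levels/2 above: wrap each element's name and arity.
levels([], []).
levels([X|Xs], [level(N, A, 1)|Ys]) :-
    functor(X, N, A),
    levels(Xs, Ys).

go_first(L, R) :-
    remove_dups_first(L, RL),
    levels(RL, R).

% ?- go_first([a,b,a], L).
% L = [level(a,0,1), level(b,0,1)].
```

With this ordering the result matches the expected output given in the question.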

Resources