Getting hold of a variable in complex compound term in Prolog - prolog

I have a Prolog sentence parser that returns a sentence (passed into it as a list) split into two parts - a Noun_Phrase and a Verb_Phrase. See example below:
sentence(Sentence, sentence(np(Noun_Phrase), vp(Verb_Phrase))) :-
np(Sentence, Noun_Phrase, Remainder),
vp(Remainder, Verb_Phrase).
Now I want to take the Noun_Phrase and Verb_Phrase and pass them into another Prolog predicate, but first I want to extract the first term from the Verb_Phrase (which should always be a verb) into a variable and the rest of the Verb_Phrase into another one and pass them separately into the next predicate.
I thought about using unification for this and I have tried:
sentence(Sentence, sentence(np(Noun_Phrase), vp(Verb_Phrase))),
[Head|Tail] = Verb_Phrase,
next_predicate(_, Noun_Phrase, Head, Tail, _).
But I am getting ERROR: Out of local stack exception every time. I think this has something to do with the Verb_Phrase not really being a list. This is a possible isntance of Verb_Phrase:
VP = vp(vp(verb(making), adj(quick), np2(noun(improvements))))
How could I get the verb(X) as variable Verb and the rest of the term as varaible Rest out of such compound term in Prolog?

You could use =../2 like:
Verb_Phrase=..[Verb|Rest_Term_list].
Example:
?- noun(improvements)=..[Verb|Rest_Term_list].
Verb = noun,
Rest_Term_list = [improvements].

Related

Automating my pet debugging strategy in SWI-Prolog

I have a very straightforward question I'd be happy to receive any guidance on.
I'm working on a Definite Clause Grammar, and I'm running spot checks on its output. If a parse tree is confusing to me, I want to trace it back to the predicate that generated that part of the tree. So what I do is insert numeric atoms into my predicates. Like so:
sentence(sentence(Subject, Verb, Object)) --> Subject, Verb, Object.
becomes
sentence(sentence(736, Subject, Verb, Object)) --> Subject, Verb, Object.
I can then search for the number 736 and examine that particular predicate to see why it was chosen by Prolog. This has become very handy as my grammar has ballooned in size. But it's inconvenient to have to make these text edits whenever I want to debug.
Is there some elegant Prolog rule I could add to the grammar when I want to debug in this way, something that would attach a unique i.d. to each predicate?
This is highly implementation-specific, but SWI-Prolog has a source_location/2 predicate that, called inside a term_expansion/2 rule, gives you the file name and line number of the clause being expanded.
So you can use something like the following:
term_expansion(Head --> Body, EnhancedHead --> Body) :-
source_location(File, Line),
format('~w --> ~w at ~w:~w~n', [Head, Body, File, Line]),
Head =.. [Functor, Arg1 | Args],
Arg1 =.. [ArgFunctor | ArgArgs],
EnhancedArg1 =.. [ArgFunctor, File:Line | ArgArgs],
EnhancedHead =.. [Functor, EnhancedArg1 | Args].
hello -->
[world].
sentence(sentence(Subject, Verb, Object)) -->
[Subject, Verb, Object].
Note that this term_expansion/2 will print the log message for every -->/2 rule in the program:
hello --> [world] at /home/isabelle/hello.pl:9
sentence(sentence(_2976,_2978,_2980)) --> [_2976,_2978,_2980] at /home/isabelle/hello.pl:12
But it will then fail if the rule's head doesn't have at least one argument, and the first argument doesn't have at least one argument of its own. This is fine, failure just means "don't rewrite this term":
?- listing(hello).
hello([world|A], A).
true.
?- phrase(hello, Hello).
Hello = [world].
But sentence//1 will be rewritten:
?- listing(sentence).
sentence(sentence('/home/isabelle/hello.pl':12, A, B, C), [A, B, C|D], D).
true.
?- phrase(sentence(sentence(Position, S, V, O)), [isabelle, likes, prolog]).
Position = '/home/isabelle/hello.pl':12,
S = isabelle,
V = likes,
O = prolog.
You could build on this, maybe with a separate operator ---> to mark only those rules you really want rewritten. I think having this extra implicit argument is a recipe for lots of unexpected failures when you try to unify something with the actual underlying term, not the term as it appears in the source code.
So maybe a better approach would be something like this:
sentence(sentence(#position, Subject, Verb, Object)) -->
[Subject, Verb, Object].
and a corresponding term_expansion/2 rule that looks for these #position terms and replaces them accordingly.

Pattern matching using list of characters

I am having difficulty pattern matching words which are converted to lists of characters:
wordworm(H1,H2,H3,V1,V2) :-
word(H1), string_length(H1,7),
word(H2), string_length(H2,5),
word(H3), string_length(H3,4),
word(V1), string_length(V1,4),
word(H3) \= word(V1),
atom_chars(H2, [_,_,Y,_,_]) = atom_chars(V1, [_,_,_,Y]),
word(V2), string_length(V2,5),
word(H2) \= word(V2),
atom_chars(H3, [_,_,_,Y]) = atom_chars(V2, [_,_,_,_,Y]).
Above this section, I have a series of 600 words in the format, word("prolog"). The code runs fine, without the atom_chars, but with it, I get a time-out error. Can anyone suggest a better way for me to structure my code?
Prolog predicate calls are not like function calls in other languages. They do not have "return values".
When you write X = atom_chars(foo, Chars) this does not execute atom_chars. It builds a data structure atom_chars(foo, Chars). It does not "call" this data structure.
If you want to evaluate atom_chars on some atom H2 and then say something about the resulting list, call it like:
atom_chars(H2, H2Chars),
H2Chars = [_,_,Y,_,_]
So overall maybe your code should look more like this:
...,
atom_chars(H2, H2Chars),
H2Chars = [_,_,Y,_,_],
atom_chars(V1, V1Chars),
V1Chars = [_,_,_,Y],
...
Note that you don't need to assert some kind of "equality" between these atom_chars goals. The fact that their char lists share the same variable Y means that there will be a connection: The third character of H2 must be equal to the fourth character of V1.

Checking if all elements of list are same data type (atom I guess). Prolog

I am a total beginner in Prolog. I have some custom types: bird, animal and fish. I want to pass a list to a function like so areSameType([owl, eagle, chicken]). and get a result if the whole list is type bird or animal or fish. For example:
areSameType([owl,giraffe,shark]). > false
areSameType([owl,eagle,chicken]). > true
areSameType([cat,mouse,giraffe]). > true
The data I have inserted is:
bird(owl).
bird(eagle).
bird(chicken).
animal(cat).
animal(mouse).
animal(giraffe).
fish(shark).
fish(magikarp).
fish(gyarados).
I have tried with this function:
isSameType(X,Y):- bird(X),bird(Y);animal(X),animal(Y);fish(X),fish(Y).
areSameType([H1,H2|T]):- isSameType(H1,H2), areSameType([H2,T]).
But the problem is I don't have a criteria to check if H2 is the last element of the list or maybe I got it all wrong with this logic.
There are two problems here with your areSameType/1 predicate:
there is no stop condition: in case the list is empty, or contains exactly one element, than all the items in the list have the same type; and
you need to recurse with [H2|T] (notice the pipe |, instead of the comma ,).
So we can fix this with:
areSameType([]).
areSameType([_]).
areSameType([H1,H2|T]):-
isSameType(H1,H2),
areSameType([H2|T]).
We can however use maplist/2 [swi-doc] here, and write this predicate without recursion (well no recursion in the areSameType/1 predicate itself), like:
areSameType([]).
areSameType([H1|T]) :-
maplist(isSameType(H1), T).

Parse To Prolog Variables Using DCG

I want to parse a logical expression using DCG in Prolog.
The logical terms are represented as lists e.g. ['x','&&','y'] for x ∧ y the result should be the parse tree and(X,Y) (were X and Y are unassigned Prolog variables).
I implemented it and everything works as expected but I have one problem:
I can't figure out how to parse the variable 'x' and 'y' to get real Prolog variables X and Y for the later assignment of truth values.
I tried the following rule variations:
v(X) --> [X].:
This doesn't work of course, it only returns and('x','y').
But can I maybe uniformly replace the logical variables in this term with Prolog variables? I know of the predicate term_to_atom (which is proposed as a solution for a similar problem) but I don't think it can be used here to achieve the desired result.
v(Y) --> [X], {nonvar(Y)}.:
This does return an unbound variable but of course a new one every time even if the logical variable ('x','y',...) was already in the term so
['X','&&','X'] gets evaluated to and(X,Y) which is not the desired result, either.
Is there any elegant or idiomatic solution to this problem?
Many thanks in advance!
EDIT:
The background to this question is that I'm trying to implement the DPLL-algorithm in Prolog. I thought it would by clever to directly parse the logical term to a Prolog-term to make easy use of the Prolog backtracking facility:
Input: some logical term, e.g T = [x,'&&',y]
Term after parsing: [G_123,'&&',G_456] (now featuring "real" Prolog variables)
Assign a value from { boolean(t), boolean(f) } to the first unbound variable in T.
simplify the term.
... repeat or backtrack until a assignment v is found so that v(T) = t or the search space is depleted.
I'm pretty new to Prolog and honestly couldn't figure out a better approach. I'm very interested in better alternatives! (So I'm kinda half-shure that this is what I want ;-) and thank you very much for your support so far ...)
You want to associate ground terms like x (no need to write 'x') with uninstantiated variables. Certainly that does not constitute a pure relation. So it is not that clear to me that you actually want this.
And where do you get the list [x, &&, x] in the first place? You probably have some kind of tokenizer. If possible, try to associate variable names to variables prior to the actual parsing. If you insist to perform that association during parsing you will have to thread a pair of variables throughout your entire grammar. That is, instead of a clean grammar like
power(P) --> factor(F), power_r(F, P).
you will now have to write
power(P, D0,D) --> factor(F, D0,D1), power_r(F, P, D1,D).
% ^^^^ ^^^^^ ^^^^
since you are introducing context into an otherwise context free grammar.
When parsing Prolog text, the same problem occurs. The association between a variable name and a concrete variable is already established during tokenizing. The actual parser does not have to deal with it.
There are essentially two ways to perform this during tokenization:
1mo collect all occurrences Name=Variable in a list and unify them later:
v(N-V, [N-V|D],D) --> [N], {maybesometest(N)}.
unify_nvs(NVs) :-
keysort(NVs, NVs2),
uniq(NVs2).
uniq([]).
uniq([NV|NVs]) :-
head_eq(NVs, NV).
uniq(NVs).
head_eq([], _).
head_eq([N-V|_],N-V).
head_eq([N1-_|_],N2-_) :-
dif(N1,N2).
2do use some explicit dictionary to merge them early on.
Somewhat related is this question.
Not sure if you really want to do what you asked. You might do it by keeping a list of variable associations so that you would know when to reuse a variable and when to use a fresh one.
This is an example of a greedy descent parser which would parse expressions with && and ||:
parse(Exp, Bindings, NBindings)-->
parseLeaf(LExp, Bindings, MBindings),
parse_cont(Exp, LExp, MBindings, NBindings).
parse_cont(Exp, LExp, Bindings, NBindings)-->
parse_op(Op, LExp, RExp),
{!},
parseLeaf(RExp, Bindings, MBindings),
parse_cont(Exp, Op, MBindings, NBindings).
parse_cont(Exp, Exp, Bindings, Bindings)-->[].
parse_op(and(LExp, RExp), LExp, RExp)--> ['&&'].
parse_op(or(LExp, RExp), LExp, RExp)--> ['||'].
parseLeaf(Y, Bindings, NBindings)-->
[X],
{
(member(bind(X, Var), Bindings)-> Y-NBindings=Var-Bindings ; Y-NBindings=Var-[bind(X, Var)|Bindings])
}.
It parses the expression and returns also the variable bindings.
Sample outputs:
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'y']).
Exp = and(_G683, _G696),
Bindings = [bind(y, _G696), bind(x, _G683)].
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'x']).
Exp = and(_G683, _G683),
Bindings = [bind(x, _G683)].
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'y', '&&', 'x', '||', 'z']).
Exp = or(and(and(_G839, _G852), _G839), _G879),
Bindings = [bind(z, _G879), bind(y, _G852), bind(x, _G839)].

Prolog build rules from atoms

I'm currently trying to to interpret user-entered strings via Prolog. I'm using code I've found on the internet, which converts a string into a list of atoms.
"Men are stupid." => [men,are,stupid,'.'] % Example
From this I would like to create a rule, which then can be used in the Prolog command-line.
% everyone is a keyword for a rule. If the list doesn't contain 'everyone'
% it's a fact.
% [men,are,stupid]
% should become ...
stupid(men).
% [everyone,who,is,stupid,is,tall]
% should become ...
tall(X) :- stupid(X).
% [everyone,who,is,not,tall,is,green]
% should become ...
green(X) :- not(tall(X)).
% Therefore, this query should return true/yes:
?- green(women).
true.
I don't need anything super fancy for this as my input will always follow a couple of rules and therefore just needs to be analyzed according to these rules.
I've been thinking about this for probably an hour now, but didn't come to anything even considerable, so I can't provide you with what I've tried so far. Can anyone push me into the right direction?
Consider using a DCG. For example:
list_clause(List, Clause) :-
phrase(clause_(Clause), List).
clause_(Fact) --> [X,are,Y], { Fact =.. [Y,X] }.
clause_(Head :- Body) --> [everyone,who,is,B,is,A],
{ Head =.. [A,X], Body =.. [B,X] }.
Examples:
?- list_clause([men,are,stupid], Clause).
Clause = stupid(men).
?- list_clause([everyone,who,is,stupid,is,tall], Clause).
Clause = tall(_G2763):-stupid(_G2763).
I leave the remaining example as an easy exercise.
You can use assertz/1 to assert such clauses dynamically:
?- List = <your list>, list_clause(List, Clause), assertz(Clause).
First of all, you could already during the tokenization step make terms instead of lists, and even directly assert rules into the database. Let's take the "men are stupid" example.
You want to write down something like:
?- assert_rule_from_sentence("Men are stupid.").
and end up with a rule of the form stupid(men).
assert_rule_from_sentence(Sentence) :-
phrase(sentence_to_database, Sentence).
sentence_to_database -->
subject(Subject), " ",
"are", " ",
object(Object), " ",
{ Rule =.. [Object, Subject],
assertz(Rule)
}.
(let's assume you know how to write the DCGs for subject and object)
This is it! Of course, your sentence_to_database//0 will need to have more clauses, or use helper clauses and predicates, but this is at least a start.
As #mat says, it is cleaner to first tokenize and then deal with the tokenized sentence. But then, it would go something like this:
tokenize_sentence(be(Subject, Object)) -->
subject(Subject), space,
be, !,
object(Object), end.
(now you also need to probably define what a space and an end of sentence is...)
be -->
"is".
be -->
"are".
assert_tokenized(be(Subject, Object)) :-
Fact =.. [Object, Subject],
assertz(Fact).
The main reason for doing it this way is that you know during the tokenization what sort of sentence you have: subject - verb - object, or subject - modifier - object - modifier etc, and you can use this information to write your assert_tokenized/1 in a more explicit way.
Definite Clause Grammars are Prolog's go-to tool for translating from strings (such as your English sentences) to Prolog terms (such as the Prolog clauses you want to generate), or the other way around. Here are two introductions I'd recommend:
http://www.learnprolognow.org/lpnpage.php?pagetype=html&pageid=lpn-htmlse29
http://www.pathwayslms.com/swipltuts/dcg/

Resources