Prolog build rules from atoms - prolog

I'm currently trying to interpret user-entered strings via Prolog. I'm using code I've found on the internet, which converts a string into a list of atoms.
"Men are stupid." => [men,are,stupid,'.'] % Example
From this I would like to create a rule, which then can be used in the Prolog command-line.
% everyone is a keyword for a rule. If the list doesn't contain 'everyone'
% it's a fact.
% [men,are,stupid]
% should become ...
stupid(men).
% [everyone,who,is,stupid,is,tall]
% should become ...
tall(X) :- stupid(X).
% [everyone,who,is,not,tall,is,green]
% should become ...
green(X) :- not(tall(X)).
% Therefore, this query should return true/yes:
?- green(women).
true.
I don't need anything super fancy for this as my input will always follow a couple of rules and therefore just needs to be analyzed according to these rules.
I've been thinking about this for probably an hour now, but haven't come up with anything worth showing, so I can't provide you with what I've tried so far. Can anyone point me in the right direction?

Consider using a DCG. For example:
list_clause(List, Clause) :-
    phrase(clause_(Clause), List).
clause_(Fact) --> [X,are,Y], { Fact =.. [Y,X] }.
clause_((Head :- Body)) --> [everyone,who,is,B,is,A],
    { Head =.. [A,X], Body =.. [B,X] }.
Examples:
?- list_clause([men,are,stupid], Clause).
Clause = stupid(men).
?- list_clause([everyone,who,is,stupid,is,tall], Clause).
Clause = tall(_G2763):-stupid(_G2763).
I leave the remaining example as an easy exercise.
You can use assertz/1 to assert such clauses dynamically:
?- List = <your list>, list_clause(List, Clause), assertz(Clause).
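For the remaining negated example, one possible sketch extends the grammar above with a small helper nonterminal. The name condition//2 is made up here for illustration, and \+ is used in place of not/1 (in SWI-Prolog the two behave the same).

```prolog
% Sketch only: condition//2 is a hypothetical helper, not part of the answer above.
list_clause(List, Clause) :-
    phrase(clause_(Clause), List).

% [men,are,stupid]  ==>  stupid(men)
clause_(Fact) --> [X, are, Y], { Fact =.. [Y, X] }.
% [everyone,who,is,...,is,A]  ==>  A(X) :- Body
clause_((Head :- Body)) -->
    [everyone, who, is], condition(X, Body), [is, A],
    { Head =.. [A, X] }.

% A condition is either "not B" or just "B"; both share the variable X.
condition(X, \+ Goal) --> [not, B], { Goal =.. [B, X] }.
condition(X, Goal)    --> [B], { B \== not, Goal =.. [B, X] }.
```

With this, [everyone,who,is,not,tall,is,green] is translated to green(X) :- \+ tall(X). Before querying green(women), tall/1 has to exist (for example via :- dynamic tall/1.), otherwise SWI-Prolog raises an existence error instead of simply failing.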

First of all, you could already during the tokenization step make terms instead of lists, and even directly assert rules into the database. Let's take the "men are stupid" example.
You want to write down something like:
?- assert_rule_from_sentence("Men are stupid.").
and end up with a rule of the form stupid(men).
assert_rule_from_sentence(Sentence) :-
    phrase(sentence_to_database, Sentence).
sentence_to_database -->
    subject(Subject), " ",
    "are", " ",
    object(Object), ".",
    { Rule =.. [Object, Subject],
      assertz(Rule)
    }.
(let's assume you know how to write the DCGs for subject and object)
This is it! Of course, your sentence_to_database//0 will need to have more clauses, or use helper clauses and predicates, but this is at least a start.
As @mat says, it is cleaner to first tokenize and then deal with the tokenized sentence. But then, it would go something like this:
tokenize_sentence(be(Subject, Object)) -->
    subject(Subject), space,
    be, !,
    object(Object), end.
(now you also need to probably define what a space and an end of sentence is...)
be -->
    "is".
be -->
    "are".
assert_tokenized(be(Subject, Object)) :-
    Fact =.. [Object, Subject],
    assertz(Fact).
The main reason for doing it this way is that you know during the tokenization what sort of sentence you have: subject - verb - object, or subject - modifier - object - modifier etc, and you can use this information to write your assert_tokenized/1 in a more explicit way.
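To make the tokenize-then-assert idea concrete, here is a minimal self-contained sketch, assuming SWI-Prolog 7 or later. The nonterminals sentence//1, word//1 and letters//1 and the predicate assert_sentence/1 are invented for this illustration; they stand in for the subject, object, space and end definitions left open above.

```prolog
% Step 1: turn the character codes into a term such as be(men, stupid).
sentence(be(Subject, Object)) -->
    word(Subject), " ", be, " ", word(Object), ".".

be --> "is".
be --> "are".

% A word is a non-empty run of codes that code_type/2 classifies as alpha,
% returned as a lower-case atom.
word(Atom) -->
    letters(Codes),
    { Codes = [_|_],
      atom_codes(Mixed, Codes),
      downcase_atom(Mixed, Atom)
    }.

letters([C|Cs]) --> [C], { code_type(C, alpha) }, !, letters(Cs).
letters([])     --> [].

% Step 2: turn the term into a fact and store it.
assert_tokenized(be(Subject, Object)) :-
    Fact =.. [Object, Subject],
    assertz(Fact).

assert_sentence(String) :-
    string_codes(String, Codes),
    phrase(sentence(Term), Codes),
    assert_tokenized(Term).
```

After ?- assert_sentence("Men are stupid."). the query ?- stupid(men). succeeds. Keeping the two steps apart means new sentence shapes only need a new term in step 1 and a matching assert_tokenized/1 clause in step 2.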

Definite Clause Grammars are Prolog's go-to tool for translating from strings (such as your English sentences) to Prolog terms (such as the Prolog clauses you want to generate), or the other way around. Here are two introductions I'd recommend:
http://www.learnprolognow.org/lpnpage.php?pagetype=html&pageid=lpn-htmlse29
http://www.pathwayslms.com/swipltuts/dcg/

Related

Automating my pet debugging strategy in SWI-Prolog

I have a very straightforward question I'd be happy to receive any guidance on.
I'm working on a Definite Clause Grammar, and I'm running spot checks on its output. If a parse tree is confusing to me, I want to trace it back to the predicate that generated that part of the tree. So what I do is insert numeric atoms into my predicates. Like so:
sentence(sentence(Subject, Verb, Object)) --> Subject, Verb, Object.
becomes
sentence(sentence(736, Subject, Verb, Object)) --> Subject, Verb, Object.
I can then search for the number 736 and examine that particular predicate to see why it was chosen by Prolog. This has become very handy as my grammar has ballooned in size. But it's inconvenient to have to make these text edits whenever I want to debug.
Is there some elegant Prolog rule I could add to the grammar when I want to debug in this way, something that would attach a unique i.d. to each predicate?
This is highly implementation-specific, but SWI-Prolog has a source_location/2 predicate that, called inside a term_expansion/2 rule, gives you the file name and line number of the clause being expanded.
So you can use something like the following:
term_expansion((Head --> Body), (EnhancedHead --> Body)) :-
    source_location(File, Line),
    format('~w --> ~w at ~w:~w~n', [Head, Body, File, Line]),
    Head =.. [Functor, Arg1 | Args],
    Arg1 =.. [ArgFunctor | ArgArgs],
    EnhancedArg1 =.. [ArgFunctor, File:Line | ArgArgs],
    EnhancedHead =.. [Functor, EnhancedArg1 | Args].
hello -->
    [world].
sentence(sentence(Subject, Verb, Object)) -->
    [Subject, Verb, Object].
Note that this term_expansion/2 will print the log message for every -->/2 rule in the program:
hello --> [world] at /home/isabelle/hello.pl:9
sentence(sentence(_2976,_2978,_2980)) --> [_2976,_2978,_2980] at /home/isabelle/hello.pl:12
But it will then fail if the rule's head doesn't have at least one argument, and the first argument doesn't have at least one argument of its own. This is fine, failure just means "don't rewrite this term":
?- listing(hello).
hello([world|A], A).
true.
?- phrase(hello, Hello).
Hello = [world].
But sentence//1 will be rewritten:
?- listing(sentence).
sentence(sentence('/home/isabelle/hello.pl':12, A, B, C), [A, B, C|D], D).
true.
?- phrase(sentence(sentence(Position, S, V, O)), [isabelle, likes, prolog]).
Position = '/home/isabelle/hello.pl':12,
S = isabelle,
V = likes,
O = prolog.
You could build on this, maybe with a separate operator ---> to mark only those rules you really want rewritten. I think having this extra implicit argument is a recipe for lots of unexpected failures when you try to unify something with the actual underlying term, not the term as it appears in the source code.
So maybe a better approach would be something like this:
sentence(sentence(#position, Subject, Verb, Object)) -->
[Subject, Verb, Object].
and a corresponding term_expansion/2 rule that looks for these #position terms and replaces them accordingly.
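A rough sketch of such a rule follows, assuming SWI-Prolog. The operator declaration for # and the helper predicates replace_position/3 and replace_position_/3 are assumptions made for this illustration; only source_location/2 and term_expansion/2 come from the system.

```prolog
% Assumption: # is declared as a prefix operator so that #position can be read.
:- op(100, fx, #).

% Rewrite a grammar rule only if its head actually contains #position.
term_expansion((Head0 --> Body), (Head --> Body)) :-
    source_location(File, Line),
    replace_position(Head0, File:Line, Head),
    Head0 \== Head.

% replace_position(+Term0, +Pos, -Term): Term is Term0 with every
% #position subterm replaced by Pos; variables are left untouched.
replace_position(Term, _, Term) :-
    var(Term), !.
replace_position(#position, Pos, Pos) :- !.
replace_position(Term0, Pos, Term) :-
    compound(Term0), !,
    Term0 =.. [Functor | Args0],
    maplist(replace_position_(Pos), Args0, Args),
    Term =.. [Functor | Args].
replace_position(Term, _, Term).

% Argument-order adapter for maplist/3.
replace_position_(Pos, Term0, Term) :-
    replace_position(Term0, Pos, Term).
```

Rules without a #position marker make the term_expansion/2 clause fail, so they are compiled unchanged, and the extra argument is visible in the source wherever it exists in the compiled term.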

Getting hold of a variable in complex compound term in Prolog

I have a Prolog sentence parser that returns a sentence (passed into it as a list) split into two parts - a Noun_Phrase and a Verb_Phrase. See example below:
sentence(Sentence, sentence(np(Noun_Phrase), vp(Verb_Phrase))) :-
np(Sentence, Noun_Phrase, Remainder),
vp(Remainder, Verb_Phrase).
Now I want to take the Noun_Phrase and Verb_Phrase and pass them into another Prolog predicate, but first I want to extract the first term from the Verb_Phrase (which should always be a verb) into a variable and the rest of the Verb_Phrase into another one and pass them separately into the next predicate.
I thought about using unification for this and I have tried:
sentence(Sentence, sentence(np(Noun_Phrase), vp(Verb_Phrase))),
[Head|Tail] = Verb_Phrase,
next_predicate(_, Noun_Phrase, Head, Tail, _).
But I am getting an "ERROR: Out of local stack" exception every time. I think this has something to do with Verb_Phrase not really being a list. This is a possible instance of Verb_Phrase:
VP = vp(vp(verb(making), adj(quick), np2(noun(improvements))))
How could I get the verb(X) as variable Verb and the rest of the term as variable Rest out of such a compound term in Prolog?
You could use =../2 like:
Verb_Phrase=..[Verb|Rest_Term_list].
Example:
?- noun(improvements)=..[Verb|Rest_Term_list].
Verb = noun,
Rest_Term_list = [improvements].
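Applied to the nested vp(vp(...)) term from the question, the same idea might look like the sketch below. The predicate name verb_and_rest/3 is made up for this example, and it assumes the phrase is already instantiated and that the verb is always the first argument of the innermost vp term.

```prolog
% verb_and_rest(+VerbPhrase, -Verb, -Rest)
% Peel off a vp(...) wrapper whose only argument is itself a vp term.
verb_and_rest(vp(Inner), Verb, Rest) :-
    compound(Inner),
    functor(Inner, vp, _), !,
    verb_and_rest(Inner, Verb, Rest).
% Otherwise split the arguments of vp(...) into the verb and the rest.
verb_and_rest(Phrase, Verb, Rest) :-
    Phrase =.. [vp, Verb | Rest],
    Verb = verb(_).
```

For the term shown in the question this binds Verb to verb(making) and Rest to the list [adj(quick), np2(noun(improvements))], which can then be passed on to the next predicate as two separate arguments.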

Parse To Prolog Variables Using DCG

I want to parse a logical expression using DCG in Prolog.
The logical terms are represented as lists, e.g. ['x','&&','y'] for x ∧ y. The result should be the parse tree and(X,Y) (where X and Y are unassigned Prolog variables).
I implemented it and everything works as expected but I have one problem:
I can't figure out how to parse the variable 'x' and 'y' to get real Prolog variables X and Y for the later assignment of truth values.
I tried the following rule variations:
v(X) --> [X].:
This doesn't work of course, it only returns and('x','y').
But can I maybe uniformly replace the logical variables in this term with Prolog variables? I know of the predicate term_to_atom (which is proposed as a solution for a similar problem) but I don't think it can be used here to achieve the desired result.
v(Y) --> [X], {nonvar(Y)}.:
This does return an unbound variable but of course a new one every time even if the logical variable ('x','y',...) was already in the term so
['X','&&','X'] gets evaluated to and(X,Y) which is not the desired result, either.
Is there any elegant or idiomatic solution to this problem?
Many thanks in advance!
EDIT:
The background to this question is that I'm trying to implement the DPLL algorithm in Prolog. I thought it would be clever to directly parse the logical term to a Prolog term to make easy use of the Prolog backtracking facility:
Input: some logical term, e.g T = [x,'&&',y]
Term after parsing: [G_123,'&&',G_456] (now featuring "real" Prolog variables)
Assign a value from { boolean(t), boolean(f) } to the first unbound variable in T.
simplify the term.
... repeat or backtrack until an assignment v is found so that v(T) = t or the search space is depleted.
I'm pretty new to Prolog and honestly couldn't figure out a better approach. I'm very interested in better alternatives! (So I'm kinda half-sure that this is what I want ;-) and thank you very much for your support so far ...)
You want to associate ground terms like x (no need to write 'x') with uninstantiated variables. Certainly that does not constitute a pure relation. So it is not that clear to me that you actually want this.
And where do you get the list [x, &&, x] in the first place? You probably have some kind of tokenizer. If possible, try to associate variable names to variables prior to the actual parsing. If you insist on performing that association during parsing, you will have to thread a pair of variables throughout your entire grammar. That is, instead of a clean grammar like
power(P) --> factor(F), power_r(F, P).
you will now have to write
power(P, D0,D) --> factor(F, D0,D1), power_r(F, P, D1,D).
%        ^^^^                ^^^^^                 ^^^^
since you are introducing context into an otherwise context free grammar.
When parsing Prolog text, the same problem occurs. The association between a variable name and a concrete variable is already established during tokenizing. The actual parser does not have to deal with it.
There are essentially two ways to perform this during tokenization:
1mo collect all occurrences Name=Variable in a list and unify them later:
v(N-V, [N-V|D],D) --> [N], {maybesometest(N)}.
unify_nvs(NVs) :-
    keysort(NVs, NVs2),
    uniq(NVs2).
uniq([]).
uniq([NV|NVs]) :-
    head_eq(NVs, NV),
    uniq(NVs).
head_eq([], _).
head_eq([N-V|_],N-V).
head_eq([N1-_|_],N2-_) :-
    dif(N1,N2).
2do use some explicit dictionary to merge them early on.
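As an illustration of the first approach, here is a small sketch that collects Name-Variable pairs while parsing and unifies them afterwards. The one-operator grammar expr//3 and the wrapper parse/2 are invented for this example, and atom/1 stands in for the unspecified maybesometest/1.

```prolog
% Hypothetical grammar for "Name && Name" that threads the pair list
% as a difference list D0/D.
expr(and(L, R), D0, D) -->
    v(_-L, D0, D1), ['&&'], v(_-R, D1, D).

% Each variable token contributes one Name-Variable pair.
v(N-V, [N-V|D], D) --> [N], { atom(N) }.

% After sorting by name, equal names are adjacent and get unified.
unify_nvs(NVs) :-
    keysort(NVs, NVs2),
    uniq(NVs2).

uniq([]).
uniq([NV|NVs]) :-
    head_eq(NVs, NV),
    uniq(NVs).

head_eq([], _).
head_eq([N-V|_], N-V).
head_eq([N1-_|_], N2-_) :-
    dif(N1, N2).

parse(Tokens, Expr) :-
    phrase(expr(Expr, NVs, []), Tokens),
    unify_nvs(NVs).
```

Here ?- parse([x,'&&',x], E). gives a term and(A, A) with one shared variable, while [x,'&&',y] gives two distinct variables. The grammar itself stays free of any lookup logic; it only records what it has seen.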
Somewhat related is this question.
Not sure if you really want to do what you asked. You might do it by keeping a list of variable associations so that you would know when to reuse a variable and when to use a fresh one.
This is an example of a greedy descent parser which would parse expressions with && and ||:
parse(Exp, Bindings, NBindings)-->
    parseLeaf(LExp, Bindings, MBindings),
    parse_cont(Exp, LExp, MBindings, NBindings).
parse_cont(Exp, LExp, Bindings, NBindings)-->
    parse_op(Op, LExp, RExp),
    {!},
    parseLeaf(RExp, Bindings, MBindings),
    parse_cont(Exp, Op, MBindings, NBindings).
parse_cont(Exp, Exp, Bindings, Bindings)-->[].
parse_op(and(LExp, RExp), LExp, RExp)--> ['&&'].
parse_op(or(LExp, RExp), LExp, RExp)--> ['||'].
parseLeaf(Y, Bindings, NBindings)-->
    [X],
    {
        (   member(bind(X, Var), Bindings)
        ->  Y-NBindings = Var-Bindings
        ;   Y-NBindings = Var-[bind(X, Var)|Bindings]
        )
    }.
It parses the expression and returns also the variable bindings.
Sample outputs:
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'y']).
Exp = and(_G683, _G696),
Bindings = [bind(y, _G696), bind(x, _G683)].
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'x']).
Exp = and(_G683, _G683),
Bindings = [bind(x, _G683)].
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'y', '&&', 'x', '||', 'z']).
Exp = or(and(and(_G839, _G852), _G839), _G879),
Bindings = [bind(z, _G879), bind(y, _G852), bind(x, _G839)].

Using "=" in Prolog

I'd like to know why I get an error with my SWI Prolog when I try to do this:
(signal(X) = signal(Y)) :- (terminal(X), terminal(Y), connected(X,Y)).
terminal(X) :- ((signal(X) = 1);(signal(X) = 0)).
I get the following error
Error: trabalho.pro:13: No permission to modify static procedure
'(=)/2'
It doesn't recognize the "=" in the first line, but the second one "compiles". I guess it only accepts the "=" after the :- ? Why?
Will I need to create a predicate like: "equal(x,y) :- (x = y)" for this?
Diedre - there are no 'functions' in Prolog. There are predicates. The usual pattern goes
name(list of args to be unified) :- body of predicate .
Usually you'd want the thing on the left side of the :- operator to be a predicate name. When you write
(signal(X) = signal(Y))
= is an operator, so you get
'='(signal(X), signal(Y))
But we assume (it's not clear what you're doing here) that you don't really want to change equals.
Since '=' is already in the standard library, you can't redefine it (and wouldn't want to).
What you probably want is
equal_signal(X, Y) :- ... bunch of stuff... .
or
equal_signal(signal(X), signal(Y)) :- ... bunch of stuff ... .
This seems like a conceptual problem. You need to have a conversation with somebody who understands it. I might humbly suggest you pop onto ##prolog on freenode.net or some similar forum and get somebody to explain it.
Because = is a predefined predicate. What you actually write is (the grounding of terms using the Martelli-Montanari algorithm):
=(signal(X),signal(Y)) :- Foo.
In Prolog you use predicates where other languages would use functions.
You can define something like:
terminal(X) :- signal(X,1);signal(X,0).
where signal/2 is a predicate that contains a key/value pair.
And:
equal_signal(X,Y) :- terminal(X),terminal(Y),connected(X,Y).
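To see these definitions at work, here is a tiny sketch with made-up facts. The terminal names a, b and c, their signal values, and the connected/2 fact are assumptions for illustration only.

```prolog
% Hypothetical data: signal(Terminal, Value) stores one key/value pair per terminal.
signal(a, 1).
signal(b, 1).
signal(c, 0).

% Hypothetical wiring.
connected(a, b).

% A terminal is anything that carries a signal of 1 or 0.
terminal(X) :- signal(X, 1) ; signal(X, 0).

% Ordinary predicate name in the head instead of =/2.
equal_signal(X, Y) :-
    terminal(X),
    terminal(Y),
    connected(X, Y).
```

With these facts, ?- equal_signal(a, b). succeeds and ?- equal_signal(a, c). fails, and nothing in the program tries to add clauses to the built-in =/2.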

Determine the type of characters

I would like to determine in Prolog the type of a string of characters, if it is alphabetic, alphanumeric or numeric.
For example:
"I use this page" alphabetic
"0c0d24e" alphanumeric
How can I do this?
The predicate available is char_type/2, or better, code_type/2.
To apply it to each code in the string, use maplist/2. The only problem is that the argument order of code_type/2 is the wrong way round for this. So a helper predicate is needed (or download lambda, if you're using SWI-Prolog, with ?- pack_install(lambda).).
Without lambda:
code_type_(X,Y) :- code_type(Y,X).
?- maplist(code_type_(alpha), "abc").
true.
With lambda:
?- [library(lambda)].
?- maplist(\C^code_type(C,alpha), "abc").
true.
Edit: after comments, it's apparent that more flexible parsing is required. A DCG is the recommended way to go: library(dcg/basics) offers some prebuilt 'categorizers', and shows the proper way to write your own in combination with code_type/2. For instance, here is a recently added rule:
%% prolog_var_name(-Name:atom)// is semidet.
%
% Matches a Prolog variable name. Primarily intended to deal with
% quasi quotations that embed Prolog variables.
prolog_var_name(Name) -->
    [C0], { code_type(C0, prolog_var_start) }, !,
    prolog_id_cont(CL),
    { atom_codes(Name, [C0|CL]) }.
prolog_id_cont([H|T]) -->
    [H], { code_type(H, prolog_identifier_continue) }, !,
    prolog_id_cont(T).
prolog_id_cont([]) --> "".
see how code_type/2 is used to qualify single characters...
more edit - note: untested
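qualify_atom(Atom, Type) :-
    atom_codes(Atom, Codes),
    qualify_codes(Codes, Type).
qualify_codes(Codes, Type) :-
    % alpha is tested before alnum: an all-letter string is also alnum,
    % so testing alnum first would never report alpha
    (   maplist(code_type_(alpha), Codes)
    ->  Type = alpha
    ;   maplist(code_type_(alnum), Codes)
    ->  Type = alnum
    ;   Type = unknown
    ).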
then, to work on a list
?- maplist(qualify_atom, Atoms, Types).
edit
An update of this answer is mandatory: since library(yall) has been released in SWI-Prolog, and is autoloaded, we can now write:
?- maplist([C]>>code_type(C,alpha), `abc`).
Also, note the change in literal representation: in SWI-Prolog version 7 and later, double quotes no longer denote a list of character codes, hence the backquotes.

Resources