Construct a clause step by step - Prolog

I would like to construct a clause after a series of steps. For instance, if I verify a condition, then I assert a part of the clause. If "the pen is red", I obtain:
color(pen, red)
if "the pen is on the table":
on(pen, table)
if "the table is blue":
color(table, blue)
At the end I have to get:
color(pen, red), on(pen, table), color(table,blue).
I would like to insert the final clause into an external file. How can I do this?
EDIT:
I input text similar to the above and deduce these separate predicates:
first color(pen,red), second on(pen,table), third color(table,blue)
I would like to obtain a clause such as:
text(t1):- color(pen, red), on(pen, table), color(table,blue).
and this clause must be inserted in a file.
INPUT: single predicate.
OUTPUT: one clause with all predicates.

I'd attack the problem with DCGs.
sentence(S) --> color_statement(S) ; on_statement(S).
det --> [a].
det --> [the].
color_statement(color(Noun, Color)) --> det, [Noun], [is], color(Color).
color(Color) --> [Color], { color(Color) }.
color(red). color(blue).
on_statement(on(Noun, Place)) --> det, [Noun], [is,on], det, [Place].
This is assuming you have some kind of tokenization in place, but for demo purposes, you'll find this "works":
?- phrase(sentence(S), [the,pen,is,on,the,bookshelf]).
S = on(pen, bookshelf).
You will no doubt need to extend these rules for your purposes. You can find tokenization stuff by searching, and only you know exactly the kinds of nouns and modifiers you want to support, so this is really just a sketch of how this could proceed.
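(On the tokenization point: SWI-Prolog's tokenize_atom/2, autoloaded from library(porter_stem), may already produce usable token lists; a sketch of what to expect:
?- tokenize_atom('the pen is red. the pen is on the table.', Tokens).
Tokens = [the, pen, is, red, '.', the, pen, is, on, the, table, '.'].
)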
From here you're going to create another rule to handle multiple statements.
clause([]) --> [].
clause([S|Rest]) --> sentence(S), ['.'], clause(Rest).
Testing it works like so:
?- phrase(clause(S), [the,pen,is,on,the,bookshelf,'.',the,pen,is,red,'.']).
S = [on(pen, bookshelf), color(pen, red)]
So these are the clauses you want. Now you just need a predicate to bring them together.
% fold a non-empty list into a conjunction (A, B, ...);
% the singleton clause is needed so a one-sentence text still works
list_and([X], X).
list_and([X,Y|Xs], (X,Rest)) :- list_and([Y|Xs], Rest).
clause_for(Name, Tokens, Predicate) :-
    phrase(clause(Parts), Tokens),
    list_and(Parts, AndSequence),
    Predicate = (Name :- AndSequence).
This does basically what you want, but you need to furnish a name for your predicate:
?- clause_for(bob, [the,pen,is,on,the,bookshelf,'.',the,pen,is,red,'.'], P).
P = (bob:-on(pen, bookshelf), color(pen, red))
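As for writing the final clause to an external file, as the question asks: here is a minimal sketch using the standard predicates open/3, portray_clause/2 and close/1 (the file name clauses.pl is just an example):
% append one clause to a file, rendered as valid Prolog source
write_clause_to_file(File, Clause) :-
    setup_call_cleanup(
        open(File, append, Stream),
        portray_clause(Stream, Clause),
        close(Stream)
    ).
For instance: ?- clause_for(text(t1), [the,pen,is,red,'.'], P), write_clause_to_file('clauses.pl', P).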
Hope this helps!

Related

Automating my pet debugging strategy in SWI-Prolog

I have a very straightforward question I'd be happy to receive any guidance on.
I'm working on a Definite Clause Grammar, and I'm running spot checks on its output. If a parse tree is confusing to me, I want to trace it back to the predicate that generated that part of the tree. So what I do is insert numeric atoms into my predicates. Like so:
sentence(sentence(Subject, Verb, Object)) --> Subject, Verb, Object.
becomes
sentence(sentence(736, Subject, Verb, Object)) --> Subject, Verb, Object.
I can then search for the number 736 and examine that particular predicate to see why it was chosen by Prolog. This has become very handy as my grammar has ballooned in size. But it's inconvenient to have to make these text edits whenever I want to debug.
Is there some elegant Prolog rule I could add to the grammar when I want to debug in this way, something that would attach a unique i.d. to each predicate?
This is highly implementation-specific, but SWI-Prolog has a source_location/2 predicate that, called inside a term_expansion/2 rule, gives you the file name and line number of the clause being expanded.
So you can use something like the following:
term_expansion((Head --> Body), (EnhancedHead --> Body)) :-
    source_location(File, Line),
    format('~w --> ~w at ~w:~w~n', [Head, Body, File, Line]),
    Head =.. [Functor, Arg1 | Args],
    Arg1 =.. [ArgFunctor | ArgArgs],
    EnhancedArg1 =.. [ArgFunctor, File:Line | ArgArgs],
    EnhancedHead =.. [Functor, EnhancedArg1 | Args].

hello -->
    [world].

sentence(sentence(Subject, Verb, Object)) -->
    [Subject, Verb, Object].
Note that this term_expansion/2 will print the log message for every -->/2 rule in the program:
hello --> [world] at /home/isabelle/hello.pl:9
sentence(sentence(_2976,_2978,_2980)) --> [_2976,_2978,_2980] at /home/isabelle/hello.pl:12
But it will fail if the rule's head doesn't have at least one argument, or if the first argument doesn't have at least one argument of its own. This is fine; failure just means "don't rewrite this term":
?- listing(hello).
hello([world|A], A).
true.
?- phrase(hello, Hello).
Hello = [world].
But sentence//1 will be rewritten:
?- listing(sentence).
sentence(sentence('/home/isabelle/hello.pl':12, A, B, C), [A, B, C|D], D).
true.
?- phrase(sentence(sentence(Position, S, V, O)), [isabelle, likes, prolog]).
Position = '/home/isabelle/hello.pl':12,
S = isabelle,
V = likes,
O = prolog.
You could build on this, maybe with a separate operator ---> to mark only those rules you really want rewritten. I think having this extra implicit argument is a recipe for lots of unexpected failures when you try to unify something with the actual underlying term, not the term as it appears in the source code.
So maybe a better approach would be something like this:
sentence(sentence(#position, Subject, Verb, Object)) -->
    [Subject, Verb, Object].
and a corresponding term_expansion/2 rule that looks for these #position terms and replaces them accordingly.
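A sketch of that idea, assuming SWI-Prolog (the # prefix operator must be declared so #position parses, and replace_position/3 is a hypothetical helper that walks the head term):
:- op(100, fy, #).   % let #position parse as #(position)

% replace every occurrence of #position in a term with Pos
replace_position(Term0, Pos, Term) :-
    (   Term0 == #position
    ->  Term = Pos
    ;   compound(Term0)
    ->  Term0 =.. [F|Args0],
        replace_position_list(Args0, Pos, Args),
        Term =.. [F|Args]
    ;   Term = Term0
    ).

replace_position_list([], _, []).
replace_position_list([A0|As0], Pos, [A|As]) :-
    replace_position(A0, Pos, A),
    replace_position_list(As0, Pos, As).

term_expansion((Head0 --> Body), (Head --> Body)) :-
    source_location(File, Line),
    replace_position(Head0, File:Line, Head),
    Head0 \== Head.   % only rewrite rules that actually mention #position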

Transitive function in DCG

Is it possible to make a transitive function like the following in DCG? Or to combine it with a DCG rule?
genx(A,B) :- gen(A,B).
genx(A,C) :- gen(A,B), genx(B,C).
gen(a,b).
gen(b,c).
I will explain exactly what I'm trying to do:
If i have this grammar:
noun_phrase(D,N) --> det(D), noun(N).
noun(n(cat)) --> [cat].
I want to add a restriction, for example requiring the N in noun(N) to be an animal. So I can use something like this:
noun_phrase(np(D,N)) --> det(D), noun(N), genx(N, animal).
Where the information of a cat is an animal is inferenced from some facts like:
gen(cat,pet).
gen(pet,animal).
Thanks
I'm not sure I understand.
If I'm not wrong, from the formal point of view the rules
genx(A,B) :- gen(A,B).
genx(A,C) :- gen(A,B), genx(B,C).
can be written in DCG syntax as
genx --> gen.
genx --> gen, genx.
and, with facts,
gen(a, b).
gen(b, c).
genx(a, c) should return true.
However, the A, B, C arguments in a DCG are intended to be lists.
I don't know whether it is reasonable to use a DCG (which is intended for parsing) in this way to implement algebraic rules.
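That said, what the asker describes is usually done by calling genx/2 as an ordinary goal inside { }, not as a nonterminal. A minimal sketch (det//1 and the unwrapping of the n(cat) term are illustrative assumptions):
gen(cat, pet).
gen(pet, animal).

genx(A, B) :- gen(A, B).
genx(A, C) :- gen(A, B), genx(B, C).

det(d(the)) --> [the].
noun(n(cat)) --> [cat].

% the restriction runs as a plain Prolog goal inside { }
noun_phrase(np(D, N)) -->
    det(D),
    noun(N),
    { N = n(Base), genx(Base, animal) }.
Then:
?- phrase(noun_phrase(NP), [the, cat]).
NP = np(d(the), n(cat)).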

Tokenizing a string in Prolog using DCG

Let's say I want to tokenize a string of words (symbols) and numbers separated by whitespaces. For example, the expected result of tokenizing "aa 11" would be [tkSym("aa"), tkNum(11)].
My first attempt was the code below:
whitespace --> [Ws], { code_type(Ws, space) }, whitespace.
whitespace --> [].
letter(Let) --> [Let], { code_type(Let, alpha) }.
symbol([Sym|T]) --> letter(Sym), symbol(T).
symbol([Sym]) --> letter(Sym).
digit(Dg) --> [Dg], { code_type(Dg, digit) }.
digits([Dg|Dgs]) --> digit(Dg), digits(Dgs).
digits([Dg]) --> digit(Dg).
token(tkSym(Token)) --> symbol(Token).
token(tkNum(Token)) --> digits(Digits), { number_chars(Token, Digits) }.
tokenize([Token|Tokens]) --> whitespace, token(Token), tokenize(Tokens).
tokenize([]) --> whitespace, [].
Calling tokenize on "aa bb" leaves me with several possible responses:
?- tokenize(X, "aa bb", []).
X = [tkSym([97, 97]), tkSym([98, 98])] ;
X = [tkSym([97, 97]), tkSym([98]), tkSym([98])] ;
X = [tkSym([97]), tkSym([97]), tkSym([98, 98])] ;
X = [tkSym([97]), tkSym([97]), tkSym([98]), tkSym([98])] ;
false.
In this case, however, it seems appropriate to expect only one correct answer. Here's another, more deterministic approach:
whitespace --> [Space], { code_type(Space, space) }, whitespace.
whitespace --> [].
symbol([Sym|T]) --> letter(Sym), !, symbol(T).
symbol([]) --> [].
letter(Let) --> [Let], { code_type(Let, alpha) }.
% similarly for numbers
token(tkSym(Token)) --> symbol(Token).
tokenize([Token|Tokens]) --> whitespace, token(Token), !, tokenize(Tokens).
tokenize([]) --> whitespace, [].
But there is a problem: although the single answer to token called on "aa" is now a nice list, the tokenize predicate ends up in an infinite recursion:
?- token(X, "aa", []).
X = tkSym([97, 97]).
?- tokenize(X, "aa", []).
ERROR: Out of global stack
What am I missing? How is the problem usually solved in Prolog?
The underlying problem is that in your second version, token//1 also succeeds for the "empty" token:
?- phrase(token(T), "").
T = tkSym([]).
Therefore, unintentionally, the following succeeds too, as does an arbitrary number of tokens:
?- phrase((token(T1),token(T2)), "").
T1 = T2, T2 = tkSym([]).
To fix this, I recommend you adjust the definitions so that a token must consist of at least one lexical element, as is also typical. A good way to ensure that at least one element is described is to split the DCG rules into two sets. For example, shown for symbol//1:
symbol([L|Ls]) --> letter(L), symbol_r(Ls).
symbol_r([L|Ls]) --> letter(L), symbol_r(Ls).
symbol_r([]) --> [].
This way, you avoid an unbounded recursion that can endlessly consume empty tokens.
Other points:
Always use phrase/2 to access DCGs in a portable way, i.e., independent of the actual implementation method used by any particular Prolog system.
The [] in the final DCG clause is superfluous; you can simply remove it.
Also, avoid using so many !/0. It is OK to commit to the first matching tokenization, but do it only at a single place, like via a once/1 wrapped around the phrase/2 call.
Regarding naming: I recommend using tokens//1 instead of tokenize//1 to make this more declarative. Sample queries, using the above definition of symbol//1:
?- phrase(tokens(Ts), "").
Ts = [].
?- phrase(tokens(Ls), "a").
Ls = [tkSym([97])].
?- phrase(tokens(Ls), "a b").
Ls = [tkSym([97]), tkSym([98])].
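Putting these pieces together, one complete sketch of the corrected tokenizer could look like this (assuming SWI-Prolog; the flag makes "..." literals denote code lists again, since recent SWI versions default to strings):
:- set_prolog_flag(double_quotes, codes).

whitespace --> [W], { code_type(W, space) }, whitespace.
whitespace --> [].

letter(L) --> [L], { code_type(L, alpha) }.
digit(D)  --> [D], { code_type(D, digit) }.

% each token consists of at least one element
symbol([L|Ls]) --> letter(L), symbol_r(Ls).
symbol_r([L|Ls]) --> letter(L), symbol_r(Ls).
symbol_r([]) --> [].

digits([D|Ds]) --> digit(D), digits_r(Ds).
digits_r([D|Ds]) --> digit(D), digits_r(Ds).
digits_r([]) --> [].

token(tkSym(Cs)) --> symbol(Cs).
token(tkNum(N))  --> digits(Ds), { number_codes(N, Ds) }.

tokens([T|Ts]) --> whitespace, token(T), tokens(Ts).
tokens([]) --> whitespace.
Committing at a single place then gives the expected unique answer: once(phrase(tokens(Ts), "aa 11")) yields Ts = [tkSym([97, 97]), tkNum(11)].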

Prolog build rules from atoms

I'm currently trying to interpret user-entered strings via Prolog. I'm using code I've found on the internet, which converts a string into a list of atoms.
"Men are stupid." => [men,are,stupid,'.'] % Example
From this I would like to create a rule, which then can be used in the Prolog command-line.
% everyone is a keyword for a rule. If the list doesn't contain 'everyone'
% it's a fact.
% [men,are,stupid]
% should become ...
stupid(men).
% [everyone,who,is,stupid,is,tall]
% should become ...
tall(X) :- stupid(X).
% [everyone,who,is,not,tall,is,green]
% should become ...
green(X) :- not(tall(X)).
% Therefore, this query should return true/yes:
?- green(women).
true.
I don't need anything super fancy for this as my input will always follow a couple of rules and therefore just needs to be analyzed according to these rules.
I've been thinking about this for probably an hour now, but I haven't come up with anything worth showing, so I can't provide you with what I've tried so far. Can anyone push me in the right direction?
Consider using a DCG. For example:
list_clause(List, Clause) :-
    phrase(clause_(Clause), List).

clause_(Fact) --> [X,are,Y], { Fact =.. [Y,X] }.
clause_((Head :- Body)) --> [everyone,who,is,B,is,A],
    { Head =.. [A,X], Body =.. [B,X] }.
Examples:
?- list_clause([men,are,stupid], Clause).
Clause = stupid(men).
?- list_clause([everyone,who,is,stupid,is,tall], Clause).
Clause = (tall(_G2763):-stupid(_G2763)).
I leave the remaining example as an easy exercise.
You can use assertz/1 to assert such clauses dynamically:
?- List = <your list>, list_clause(List, Clause), assertz(Clause).
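For completeness, one way to handle the negated example from the question, using \+/1 (negation as failure) in place of not/1; note the extra parentheses around the head argument, as above:
clause_((Head :- \+ Body)) --> [everyone,who,is,not,B,is,A],
    { Head =.. [A,X], Body =.. [B,X] }.
?- list_clause([everyone,who,is,not,tall,is,green], Clause).
Clause = (green(_G1):- \+tall(_G1)).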
First of all, you could build terms instead of lists already during the tokenization step, and even assert rules directly into the database. Let's take the "men are stupid" example.
You want to write down something like:
?- assert_rule_from_sentence("Men are stupid.").
and end up with a rule of the form stupid(men).
assert_rule_from_sentence(Sentence) :-
    phrase(sentence_to_database, Sentence).

sentence_to_database -->
    subject(Subject), " ",
    "are", " ",
    object(Object), " ",
    { Rule =.. [Object, Subject],
      assertz(Rule)
    }.
(let's assume you know how to write the DCGs for subject and object)
This is it! Of course, your sentence_to_database//0 will need to have more clauses, or use helper clauses and predicates, but this is at least a start.
As #mat says, it is cleaner to first tokenize and then deal with the tokenized sentence. But then, it would go something like this:
tokenize_sentence(be(Subject, Object)) -->
    subject(Subject), space,
    be, !,
    object(Object), end.
(now you also need to probably define what a space and an end of sentence is...)
be --> "is".
be --> "are".
assert_tokenized(be(Subject, Object)) :-
    Fact =.. [Object, Subject],
    assertz(Fact).
The main reason for doing it this way is that you know during the tokenization what sort of sentence you have: subject - verb - object, or subject - modifier - object - modifier etc, and you can use this information to write your assert_tokenized/1 in a more explicit way.
Definite Clause Grammars are Prolog's go-to tool for translating from strings (such as your English sentences) to Prolog terms (such as the Prolog clauses you want to generate), or the other way around. Here are two introductions I'd recommend:
http://www.learnprolognow.org/lpnpage.php?pagetype=html&pageid=lpn-htmlse29
http://www.pathwayslms.com/swipltuts/dcg/

Reading files in Prolog

I am trying to read the file 'nouns.csv', which looks like this in Notepad:
femin,femin,1,f,woman,women.
aqu,aqu,1,f,water,waters.
I have tried using this in SWI-Prolog:
18 ?- read(X).
X = end_of_file.
19 ?- see('nouns.csv').
true.
20 ?- seeing(X).
X = <stream>(000000000017F770).
21 ?- read(X).
X = end_of_file.
22 ?-
I have absolutely no experience with Input and Output (in any language) and I am confused. I thought read(X) would return the whole file as a string (which I thought is a stream). I want to read in each line and apply this predicate to it:
nounstage(Input) -->
    ["noun("],
    [Adjust],
    [")."],
    {
        append(Adjust, [46], Input)
    }.

nounlineparse(X, Y) :-
    phrase(nounstage(X), N),
    flatten(N, Y),
    asserta(Y).
I presumed I would make a giant list of every line in nouns.csv, then iterate through the list and apply nounlineparse to each element. How can I get my file/stream into this giant list (or is a giant list a bad way of doing this)?
In SWI-Prolog, the easiest route is library(csv).
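For example, a minimal sketch; with the two lines above, one would expect rows like the following (note that the trailing period of each line ends up in the last field):
?- use_module(library(csv)).
?- csv_read_file('nouns.csv', Rows, []).
Rows = [row(femin, femin, 1, f, woman, 'women.'),
        row(aqu, aqu, 1, f, water, 'waters.')].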
If you need to do specific processing on read items, library(pure_input) can help you.
The basic skeleton looks like this:
:- use_module(library(pure_input)).
:- use_module(library(dcg/basics)).

read_csv(Path, Records) :-
    phrase_from_file(read_csv_file(Records), Path).

read_csv_file([Tokens|Lines]) -->
    read_csv_line(Tokens), read_csv_file(Lines).
read_csv_file([]) --> [].

read_csv_line([Token]) -->
    string(TokenS), ( ".\n" ; "." ), { atom_codes(Token, TokenS) }.
read_csv_line([Token|Tokens]) -->
    string(TokenS), ",", { atom_codes(Token, TokenS) }, !, read_csv_line(Tokens).
You will need to pay close attention to the details...
