Removing left recursion in DCG - Prolog

I've got a small problem with left recursion in this grammar. I'm trying to write it in Prolog, but I don't know how to remove left recursion.
<expression> -> <simple_expression>
<simple_expression> -> <simple_expression> <binary_operator> <simple_expression>
<simple_expression> -> <function>
<function> -> <function> <atom>
<function> -> <atom>
<atom> -> <number> | <variable>
<binary_operator> -> + | - | * | /
expression(Expr) --> simple_expression(SExpr), { Expr = SExpr }.
simple_expression(SExpr) --> simple_expression(SExpr1), binary_operator(Op), simple_expression(SExpr2), { SExpr =.. [Op, SExpr1, SExpr2] }.
simple_expression(SExpr) --> function(Func), { SExpr = Func }.
function(Func) --> function(Func2), atom(At), { Func = [Func2, atom(At)] }.
function(Func) --> atom(At), { Func = At }.
I've written something like that, but it doesn't work at all. How can I change it to get this program working?

The problem with your program is indeed left recursion; it has to be removed, otherwise you'll get stuck in an infinite loop.
To remove immediate left recursion you replace each rule of the form
A  -> A a1 | A a2 | ... | b1 | b2 | ...
with:
A  -> b1 A' | b2 A' | ...
A' -> ε | a1 A' | a2 A' | ...
so function would be:
function -> atom, functionR.
functionR -> [].
functionR -> atom, functionR.
See the Wikipedia page on left recursion for the general transformation.
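A sketch of the resulting grammar in actual DCG notation (my own completion, not part of the original answer; the non-terminal is renamed atom_//1 to avoid confusion with the built-in atom/1, and a cons/2 tree is built through an accumulator so the result stays left-associated, as in the original left-recursive rule):
function(F) --> atom_(A), functionR(A, F).

functionR(Acc, F) --> atom_(A), functionR(cons(Acc, A), F).
functionR(F, F)   --> [].

atom_(number(N))   --> [N], { number(N) }.
atom_(variable(V)) --> [V], { atom(V) }.
For example, the query below should build a left-leaning tree:
?- phrase(function(F), [f, 1, 2]).
F = cons(cons(variable(f), number(1)), number(2)).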

The problem only arises because you are using backward chaining. In forward chaining it is possible to deal with left-recursive grammar rules directly, provided the grammar rules of the form
NT ==> NT'
don't form a cycle. You can also use auxiliary computations, i.e. {}/1, if you place them after the non-terminals of the body and if the parameters of the non-terminal in the head don't go exclusively into the auxiliary computations, i.e. the bottom-up condition.
Here is an example left recursive grammar that works perfectly this way in forward chaining:
:- use_module(library(minimal/chart)).
:- use_module(library(experiment/ref)).
:- static 'D'/3.
expr(C) ==> expr(A), [+], term(B), {C is A+B}.
expr(C) ==> expr(A), [-], term(B), {C is A-B}.
expr(A) ==> term(A).
term(C) ==> term(A), [*], factor(B), {C is A*B}.
term(C) ==> term(A), [/], factor(B), {C is A/B}.
term(A) ==> factor(A).
factor(A) ==> [A], {integer(A)}.
Here is a link to the source code of the chart parser. From this link the source code of the forward chainer can also be found. The following shows an example session:
?- use_module(library(minimal/hypo)).
?- chart([1,+,2,*,3], N) => chart(expr(X), N).
X = 7
During parsing the chart parser will fill a chart in a bottom up fashion. For each non-terminal p/n in the above productions there will be facts p/n+2. Here is the result of the chart for the above example:
:- thread_local factor/3.
factor(3, 4, 5).
factor(2, 2, 3).
factor(1, 0, 1).
:- thread_local term/3.
term(3, 4, 5).
term(2, 2, 3).
term(6, 2, 5).
term(1, 0, 1).
:- thread_local expr/3.
expr(3, 4, 5).
expr(2, 2, 3).
expr(6, 2, 5).
expr(1, 0, 1).
expr(3, 0, 3).
expr(7, 0, 5).

The answer from @thanosQR is fairly good, but it applies to a more general context than DCGs and requires a change in the parse tree. Effectively, the 'outcome' of parsing has been removed, which is not good.
If you are just interested in parsing expressions, I posted something useful here.

Answer set programming (ASP) provides another route to implement grammars.
ASP can be implemented with non-deterministic forward chaining and this is what
our library(minimal/asp) provides. The result of ASP is then a set of different models
of the given rules. We use here ASP models to represent a Cocke-Younger-Kasami
chart. We begin our chart with the given words we want to parse, which are
represented by word/3 facts. Compared to DCG we no longer pass around
lists but instead word positions. The Prolog text calc2.p shows such an implementation
of an ASP based parser. All rules are now (<=)/2 rules, meaning they are forward
chaining rules, and all heads are now choose/1 heads, meaning they make an ASP
model choice. We explain how expr is realized; term is realized similarly.
Since we do not have an automatic translation, we did the translation manually.
We will provide the words from right to left and only trigger at the beginning
of each attributed grammar rule:
choose([expr(C, I, O)]) <= posted(expr(A, I, H)), word('+', H, J), term(B, J, O), C is A+B.
choose([expr(C, I, O)]) <= posted(expr(A, I, H)), word('-', H, J), term(B, J, O), C is A-B.
choose([expr(B, I, O)]) <= posted(word('-', I, H)), term(A, H, O), B is -A.
choose([expr(A, I, O)]) <= posted(term(A, I, O)).
As can be seen no extra predicate expr_rest was needed and the translation from
grammar to rules was 1-1. The same happens for term. The execution of such a grammar
requires that first the words are posted from right to left, and the result can
then be read off from the corresponding non-terminal:
?- post(word(78,7,8)), post(word('+',6,7)), post(word(56,5,6)), post(word('*',4,5)),
post(word(34,3,4)), post(word('+',2,3)), post(word(12,1,2)), post(word('-',0,1)),
expr(X,0,8).
X = 1970
We have also made a Prolog text show.p which allows visualizing the ASP model as
a parsing chart. We simply use the common triangular matrix representation; this is
how the parsing chart for the above arithmetic expression can be visualized.
Peter Schüller (2018) - Answer Set Programming in Linguistics
https://peterschueller.com//pub/2018/2018-schueller-asp-linguistics.pdf
User Manual - Module "asp"
http://www.jekejeke.ch/idatab/doclet/prod/en/docs/15_min/10_docu/02_reference/07_theory/01_minimal/06_asp.html

Related

Fold over a partial list

This is a question provoked by an already deleted answer to this question. The issue could be summarized as follows:
Is it possible to fold over a list, with the tail of the list generated while folding?
Here is what I mean. Say I want to calculate the factorial (this is a silly example but it is just for demonstration), and decide to do it like this:
fac_a(N, F) :-
    must_be(nonneg, N),
    (   N =< 1
    ->  F = 1
    ;   numlist(2, N, [H|T]),
        foldl(multiplication, T, H, F)
    ).

multiplication(X, Y, Z) :-
    Z is Y * X.
Here, I need to generate the list that I give to foldl. However, I could do the same in constant memory (without generating the list and without using foldl):
fac_b(N, F) :-
    must_be(nonneg, N),
    (   N =< 1
    ->  F = 1
    ;   fac_b_1(2, N, 2, F)
    ).

fac_b_1(X, N, Acc, F) :-
    (   X < N
    ->  succ(X, X1),
        Acc1 is X1 * Acc,
        fac_b_1(X1, N, Acc1, F)
    ;   Acc = F
    ).
The point here is that unlike the solution that uses foldl, this uses constant memory: no need for generating a list with all values!
Calculating a factorial is not the best example, but it is easier to follow for the stupidity that comes next.
Let's say that I am really afraid of loops (and recursion), and insist on calculating the factorial using a fold. I still would need a list, though. So here is what I might try:
fac_c(N, F) :-
    must_be(nonneg, N),
    (   N =< 1
    ->  F = 1
    ;   foldl(fac_foldl(N), [2|Back], 2-Back, F-[])
    ).

fac_foldl(N, X, Acc-Back, F-Rest) :-
    (   X < N
    ->  succ(X, X1),
        F is Acc * X1,
        Back = [X1|Rest]
    ;   Acc = F,
        Back = []
    ).
To my surprise, this works as intended. I can "seed" the fold with an initial value at the head of a partial list, and keep on adding the next element as I consume the current head. The definition of fac_foldl/4 is almost identical to the definition of fac_b_1/4 above: the only difference is that the state is maintained differently. My assumption here is that this should use constant memory: is that assumption wrong?
I know this is silly, but it could however be useful for folding over a list that cannot be known when the fold starts. In the original question we had to find a connected region, given a list of x-y coordinates. It is not enough to fold over the list of x-y coordinates once (you can however do it in two passes; note that there is at least one better way to do it, referenced in the same Wikipedia article, but this also uses multiple passes; altogether, the multiple-pass algorithms assume constant-time access to neighboring pixels!).
My own solution to the original "regions" question looks something like this:
set_region_rest([A|As], Region, Rest) :-
sort([A|As], [B|Bs]),
open_set_closed_rest([B], Bs, Region0, Rest),
sort(Region0, Region).
open_set_closed_rest([], Rest, [], Rest).
open_set_closed_rest([X-Y|As], Set, [X-Y|Closed0], Rest) :-
X0 is X-1, X1 is X + 1,
Y0 is Y-1, Y1 is Y + 1,
ord_intersection([X0-Y,X-Y0,X-Y1,X1-Y], Set, New, Set0),
append(New, As, Open),
open_set_closed_rest(Open, Set0, Closed0, Rest).
Using the same "technique" as above, we can twist this into a fold:
set_region_rest_foldl([A|As], Region, Rest) :-
sort([A|As], [B|Bs]),
foldl(region_foldl, [B|Back],
closed_rest(Region0, Bs)-Back,
closed_rest([], Rest)-[]),
!,
sort(Region0, Region).
region_foldl(X-Y,
closed_rest([X-Y|Closed0], Set)-Back,
closed_rest(Closed0, Set0)-Back0) :-
X0 is X-1, X1 is X + 1,
Y0 is Y-1, Y1 is Y + 1,
ord_intersection([X0-Y,X-Y0,X-Y1,X1-Y], Set, New, Set0),
append(New, Back0, Back).
This also "works". The fold leaves behind a choice point, because I haven't articulated the end condition as in fac_foldl/4 above, so I need a cut right after it (ugly).
The Questions
Is there a clean way of closing the list and removing the cut? In the factorial example, we know when to stop because we have additional information; however, in the second example, how do we notice that the back of the list should be the empty list?
Is there a hidden problem I am missing?
This looks like it's somehow similar to the Implicit State with DCGs, but I have to admit I never quite got how that works; are these connected?
You are touching on several extremely interesting aspects of Prolog, each well worth several separate questions on its own. I will provide a high-level answer to your actual questions, and hope that you post follow-up questions on the points that are most interesting to you.
First, I will trim down the fragment to its essence:
essence(N) :-
    foldl(essence_(N), [2|Back], Back, _).

essence_(N, X0, Back, Rest) :-
    (   X0 #< N ->
        X1 #= X0 + 1,
        Back = [X1|Rest]
    ;   Back = []
    ).
Note that this prevents the creation of extremely large integers, so that we can really study the memory behaviour of this pattern.
To your first question: Yes, this runs in O(1) space (assuming constant space for arising integers).
Why? Because although you continuously create lists in Back = [X1|Rest], these lists can all be readily garbage collected because you are not referencing them anywhere.
To test memory aspects of your program, consider for example the following query, and limit the global stack of your Prolog system so that you can quickly detect growing memory by running out of (global) stack:
?- length(_, E),
N #= 2^E,
portray_clause(N),
essence(N),
false.
This yields:
1.
2.
...
8388608.
16777216.
etc.
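As an aside from me (not part of the original answer): in current SWI-Prolog versions, one way to impose such a limit is the stack_limit flag, which takes a size in bytes, for example:
?- set_prolog_flag(stack_limit, 100_000_000).
true.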
It would be completely different if you referenced the list somewhere. For example:
essence(N) :-
    foldl(essence_(N), [2|Back], Back, _),
    Back = [].
With this very small change, the above query yields:
?- length(_, E),
N #= 2^E,
portray_clause(N),
essence(N),
false.
1.
2.
...
1048576.
ERROR: Out of global stack
Thus, whether a term is referenced somewhere can significantly influence the memory requirements of your program. This sounds quite frightening, but really is hardly an issue in practice: You either need the term, in which case you need to represent it in memory anyway, or you don't need the term, in which case it is simply no longer referenced in your program and becomes amenable to garbage collection. In fact, the amazing thing is rather that GC works so well in Prolog also for quite complex programs that not much needs to be said about it in many situations.
On to your second question: Clearly, using (->)/2 is almost always highly problematic in that it limits you to a particular direction of use, destroying the generality we expect from logical relations.
There are several solutions for this. If your CLP(FD) system supports zcompare/3 or a similar feature, you can write essence_/4 as follows:
essence_(N, X0, Back, Rest) :-
    zcompare(C, X0, N),
    closing(C, X0, Back, Rest).

closing(<, X0, [X1|Rest], Rest) :- X1 #= X0 + 1.
closing(=, _, [], _).
Another very nice meta-predicate called if_/3 was recently introduced in Indexing dif/2 by Ulrich Neumerkel and Stefan Kral. I leave implementing this with if_/3 as a very worthwhile and instructive exercise. Discussing this is well worth its own question!
On to the third question: How do states with DCGs relate to this? DCG notation is definitely useful if you want to pass around a global state to several predicates, where only a few of them need to access or modify the state, and most of them simply pass the state through. This is completely analogous to monads in Haskell.
The "normal" Prolog solution would be to extend each predicate with 2 arguments to describe the relation between the state before the call of the predicate, and the state after it. DCG notation lets you avoid this hassle.
Importantly, using DCG notation, you can copy imperative algorithms almost verbatim to Prolog, without the hassle of introducing many auxiliary arguments, even if you need global states. As an example for this, consider a fragment of Tarjan's strongly connected components algorithm in imperative terms:
function strongconnect(v)
// Set the depth index for v to the smallest unused index
v.index := index
v.lowlink := index
index := index + 1
S.push(v)
This clearly makes use of a global stack and index, which ordinarily would become new arguments that you need to pass around in all your predicates. Not so with DCG notation! For the moment, assume that the global entities are simply easily accessible, and so you can code the whole fragment in Prolog as:
scc_(V) -->
vindex_is_index(V),
vlowlink_is_index(V),
index_plus_one,
s_push(V),
This is a very good candidate for its own question, so consider this a teaser.
At last, I have a general remark: In my view, we are only at the beginning of finding a series of very powerful and general meta-predicates, and the solution space is still largely unexplored. call/N, maplist/[3,4], foldl/4 and other meta-predicates are definitely a good start. if_/3 has the potential to combine good performance with the generality we expect from Prolog predicates.
If your Prolog implementation supports freeze/2 or a similar predicate (e.g. SWI-Prolog), then you can use the following approach:
fac_list(L, N, Max) :-
    (N >= Max, L = [Max], !)
    ;
    freeze(L, (
        L = [N|Rest],
        N2 is N + 1,
        fac_list(Rest, N2, Max)
    )).

multiplication(X, Y, Z) :-
    Z is Y * X.

factorial(N, Factorial) :-
    fac_list(L, 1, N),
    foldl(multiplication, L, 1, Factorial).
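As a small usage check (mine, not from the original answer), the following query should succeed:
?- factorial(5, F).
F = 120.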
The example above first defines a predicate (fac_list) which creates a "lazy" list of increasing integer values starting from N up to a maximum value (Max), where the next list element is generated only after the previous one has been "accessed" (more on that below). Then, factorial just folds multiplication over the lazy list, resulting in constant memory usage.
The key to understanding how this example works is remembering that Prolog lists are, in fact, just terms of arity 2 with the name '.' (actually, in SWI-Prolog 7 the name was changed, but this is not important for this discussion), where the first argument represents the list item and the second argument represents the tail (or the terminating element - the empty list, []). For example, [1, 2, 3] can be represented as:
.(1, .(2, .(3, [])))
Then, freeze is defined as follows:
freeze(+Var, :Goal)
Delay the execution of Goal until Var is bound
This means if we call:
freeze(L, L=[1|Tail]), L = [A|Rest].
then the following steps will happen:
1. freeze(L, L=[1|Tail]) is called.
2. Prolog "remembers" that when L is unified with "anything", it needs to call L=[1|Tail].
3. L = [A|Rest] is called.
4. Prolog unifies L with .(A, Rest).
5. This unification triggers execution of L=[1|Tail].
6. This, obviously, unifies L, which at this point is bound to .(A, Rest), with .(1, Tail).
7. As a result, A gets unified with 1.
We can extend this example as follows:
freeze(L1, L1=[1|L2]),
freeze(L2, L2=[2|L3]),
freeze(L3, L3=[3]),
L1 = [A|R2], % L1=[1|L2] is called at this point
R2 = [B|R3], % L2=[2|L3] is called at this point
R3 = [C]. % L3=[3] is called at this point
This works exactly like the previous example, except that it gradually generates 3 elements, instead of 1.
As per Boris's request, here is the second example implemented using freeze. Honestly, I'm not quite sure whether this answers the question, as the code (and, IMO, the problem) is rather contrived, but here it is. At least I hope this will give other people an idea of what freeze might be useful for. For simplicity, I am using a 1D problem instead of 2D, but changing the code to use 2 coordinates should be rather trivial.
The general idea is to have (1) a function that generates a new Open/Closed/Rest/etc. state based on the previous one, (2) an "infinite" list generator which can be told to "stop" producing new elements from the "outside", and (3) a fold_step function which folds over the "infinite" list, generating a new state for each list item and, if that state is considered to be the last one, telling the generator to halt.
It is worth noting that the list's elements are used for no other reason but to inform the generator to stop. All calculation state is stored inside the accumulator.
Boris, please clarify whether this gives a solution to your problem. More precisely, what kind of data were you trying to pass to the fold step handler (Item, Accumulator, Next Accumulator)?
adjacent(X, Y) :-
    succ(X, Y) ;
    succ(Y, X).

state_seq(State, L) :-
    (State == halt -> L = [], !)
    ;
    freeze(L, (
        L = [H|T],
        freeze(H, state_seq(H, T))
    )).

fold_step(Item, Acc, NewAcc) :-
    next_state(Acc, NewAcc),
    NewAcc = _:_:_:NewRest,
    (   var(NewRest)
    ->  Item = next
    ;   Item = halt
    ).

next_state(Open:Set:Region:_Rest, NewOpen:NewSet:NewRegion:NewRest) :-
    Open = [],
    NewOpen = Open,
    NewSet = Set,
    NewRegion = Region,
    NewRest = Set.
next_state(Open:Set:Region:Rest, NewOpen:NewSet:NewRegion:NewRest) :-
    Open = [H|T],
    partition(adjacent(H), Set, Adjacent, NotAdjacent),
    append(Adjacent, T, NewOpen),
    NewSet = NotAdjacent,
    NewRegion = [H|Region],
    NewRest = Rest.

set_region_rest(Ns, Region, Rest) :-
    Ns = [H|T],
    state_seq(next, L),
    foldl(fold_step, L, [H]:T:[]:_, _:_:Region:Rest).
One fine improvement to the code above would be making fold_step a higher order function, passing it next_state as the first argument.
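A sketch of that improvement (my own, with a hypothetical fold_step/4 that takes the transition predicate as its first argument):
fold_step(NextState_2, Item, Acc, NewAcc) :-
    call(NextState_2, Acc, NewAcc),
    NewAcc = _:_:_:NewRest,
    (   var(NewRest)
    ->  Item = next
    ;   Item = halt
    ).

% and the call in set_region_rest/3 becomes:
%     foldl(fold_step(next_state), L, [H]:T:[]:_, _:_:Region:Rest)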

Counter-intuitive behavior of min_member/2

min_member(-Min, +List)
True when Min is the smallest member in the standard order of terms. Fails if List is empty.
?- min_member(3, [1,2,X]).
X = 3.
The explanation is of course that variables come before all other terms in the standard order of terms, and unification is used. However, the reported solution feels somehow wrong.
How can it be justified? How should I interpret this solution?
EDIT:
One way to prevent min_member/2 from succeeding with this solution is to change the standard library (SWI-Prolog) implementation as follows:
xmin_member(Min, [H|T]) :-
    xmin_member_(T, H, Min).

xmin_member_([], Min0, Min) :-
    (   var(Min0), nonvar(Min)
    ->  fail
    ;   Min = Min0
    ).
xmin_member_([H|T], Min0, Min) :-
    (   H @>= Min0
    ->  xmin_member_(T, Min0, Min)
    ;   xmin_member_(T, H, Min)
    ).
The rationale behind failing instead of throwing an instantiation error (what @mat suggests in his answer, if I understood correctly) is that this is a clear question:
"Is 3 the minimum member of [1,2,X], when X is a free variable?"
and the answer to this is (to me at least) a clear "No", rather than "I can't really tell."
This is the same class of behavior as sort/2:
?- sort([A,B,C], [3,1,2]).
A = 3,
B = 1,
C = 2.
And the same tricks apply:
?- min_member(3, [1,2,A,B]).
A = 3.
?- var(B), min_member(3, [1,2,A,B]).
B = 3.
The actual source of confusion is a common problem with general Prolog code. There is no clean, generally accepted classification of the kind of purity or impurity of a Prolog predicate. In a manual, and similarly in the standard, pure and impure built-ins are happily mixed together. For this reason, things are often confused, and talking about what should be the case and what not, often leads to unfruitful discussions.
How can it be justified? How should I interpret this solution?
First, look at the "mode declaration" or "mode indicator":
min_member(-Min, +List)
In the SWI documentation, this describes the way a programmer shall use this predicate. Thus, the first argument should be uninstantiated (and probably also unaliased within the goal), and the second argument should be instantiated to a list of some sort. For all other uses you are on your own. The system assumes that you are able to check that for yourself. Are you really able to do so? I, for my part, have quite some difficulties with this. ISO has a different system which also originates in DEC10.
Further, the implementation tries to be "reasonable" for unspecified cases. In particular, it tries to be steadfast in the first argument. So the minimum is first computed independently of the value of Min. Then, the resulting value is unified with Min. This robustness against misuses often comes at a price. In this case, min_member/2 always has to visit the entire list, no matter whether this is useful or not. Consider
?- length(L, 1000000), maplist(=(1),L), min_member(2, L).
Clearly, 2 is not the minimum of L. This could be detected by considering the first element of the list only. Due to the generality of the definition, the entire list has to be visited.
This way of handling output unification is similarly handled in the standard. You can spot those cases when the (otherwise) declarative description (which is the first part of a built-in's specification) explicitly refers to unification, like
8.5.4 copy_term/2
8.5.4.1 Description
copy_term(Term_1, Term_2) is true iff Term_2 unifies
with a term T which is a renamed copy (7.1.6.2) of
Term_1.
or
8.4.3 sort/2
8.4.3.1 Description
sort(List, Sorted) is true iff Sorted unifies with
the sorted list of List (7.1.6.5).
Here are those arguments (in brackets) of built-ins that can only be understood as being output arguments. Note that there are many more which effectively are output arguments, but that do not need the process of unification after some operation. Think of 8.5.2 arg/3 (3) or 8.2.1 (=)/2 (2) or (1).
8.5.4 copy_term/2 (2),
8.4.2 compare/3 (1),
8.4.3 sort/2 (2),
8.4.4 keysort/2 (2),
8.10.1 findall/3 (3),
8.10.2 bagof/3 (3),
8.10.3 setof/3 (3).
So much for your direct questions, there are some more fundamental problems behind:
Term order
Historically, "standard" term order1 has been defined to permit the definition of setof/3 and sort/2 about 1982. (Prior to it, as in 1978, it was not mentioned in the DEC10 manual user's guide.)
From 1982 on, term order was frequently (erm, ab-) used to implement other orders, particularly, because DEC10 did not offer higher-order predicates directly. call/N was to be invented two years later (1984) ; but needed some more decades to be generally accepted. It is for this reason that Prolog programmers have a somewhat nonchalant attitude towards sorting. Often they intend to sort terms of a certain kind, but prefer to use sort/2 for this purpose — without any additional error checking. A further reason for this was the excellent performance of sort/2 beating various "efficient" libraries in other programming languages decades later (I believe STL had a bug to this end, too). Also the complete magic in the code - I remember one variable was named Omniumgatherum - did not invite copying and modifying the code.
Term order has two problems: variables (which can be further instantiated to invalidate the current ordering) and infinite terms. Both are handled in current implementations without producing an error, but with still undefined results. Yet, programmers assume that everything will work out. Ideally, there would be comparison predicates that produce
instantiation errors for unclear cases like this suggestion. And another error for incomparable infinite terms.
Both SICStus and SWI have min_member/2, but only SICStus has min_member/3 with an additional argument to specify the order employed. So the goal
?- min_member(=<, M, Ms).
behaves more to your expectations, but only for numbers (plus arithmetic expressions).
Footnotes:
[1] I quote standard, in standard term order, for this notion existed since about 1982 whereas the standard was published 1995.
Clearly min_member/2 is not a true relation:
?- min_member(X, [X,0]), X = 1.
X = 1.
yet, after simply exchanging the two goals by (highly desirable) commutativity of conjunction, we get:
?- X = 1, min_member(X, [X,0]).
false.
This is clearly quite bad, as you correctly observe.
Constraints are a declarative solution for such problems. In the case of integers, finite domain constraints are a completely declarative solution for such problems.
Without constraints, it is best to throw an instantiation error when we know too little to give a sound answer.
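A minimal sketch of the constraint-based alternative (my own illustration, assuming SWI-Prolog's library(clpfd); fd_min_member/2 is a made-up name, not a library predicate):
:- use_module(library(clpfd)).

% Min is constrained to be the minimum of the integers in the list;
% no early commitment is made, so the order of goals no longer matters.
fd_min_member(Min, [H|T]) :-
    foldl(fd_min_, T, H, Min).

fd_min_(X, Min0, Min) :- Min #= min(Min0, X).
With this definition, both ?- fd_min_member(X, [X,0]), X = 1. and ?- X = 1, fd_min_member(X, [X,0]). fail consistently.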
This is a common property of many (all?) predicates that depend on the standard order of terms, while the order between two terms can change after unification. The baseline is the conjunction below, which cannot be reverted either:
?- X @< 2, X = 3.
X = 3.
Most predicates using a -Value annotation for an argument say that pred(Value) is the same
as pred(Var), Value = Var. Here is another example:
?- sort([2,X], [3,2]).
X = 3.
These predicates only represent clean relations if the input is ground. It is too much to demand the input to be ground though because they can be meaningfully used with variables, as long as the user is aware that s/he should not further instantiate any of the ordered terms. In that sense, I disagree with @mat. I do agree that constraints can surely make some of these relations sound.
This is how min_member/2 is implemented:
min_member(Min, [H|T]) :-
    min_member_(T, H, Min).

min_member_([], Min, Min).
min_member_([H|T], Min0, Min) :-
    (   H @>= Min0
    ->  min_member_(T, Min0, Min)
    ;   min_member_(T, H, Min)
    ).
So it seems that min_member/2 actually tries to unify Min (the first argument) with the smallest element in List in the standard order of terms.
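For instance (a small check I added, not from the original answer), with a free variable in the list the variable itself is the smallest element in the standard order and gets unified with the first argument:
?- min_member(M, [b, a, Z]).
M = Z.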
I hope I am not off-topic with this third answer. I did not edit one of the previous two as I think it's a totally different idea. I was wondering if this undesired behaviour:
?- min_member(X, [A, B]), A = 3, B = 2.
X = A, A = 3,
B = 2.
can be avoided if some conditions can be postponed for the moment when A and B get instantiated.
promise_relation(Rel_2, X, Y) :-
    call(Rel_2, X, Y),
    when(ground(X), call(Rel_2, X, Y)),
    when(ground(Y), call(Rel_2, X, Y)).

min_member_1(Min, Lst) :-
    member(Min, Lst),
    maplist(promise_relation(@=<, Min), Lst).
What I want from min_member_1(?Min, ?Lst) is to express a relation that says Min will always be lower (in the standard order of terms) than any of the elements in Lst.
?- min_member_1(X, L), L = [_,2,3,4], X = 1.
X = 1,
L = [1, 2, 3, 4] .
If variables get instantiated at a later time, the order in which they get bound becomes important as a comparison between a free variable and an instantiated one might be made.
?- min_member_1(X, [A,B,C]), B is 3, C is 4, A is 1.
X = A, A = 1,
B = 3,
C = 4 ;
false.
?- min_member_1(X, [A,B,C]), A is 1, B is 3, C is 4.
false.
But this can be avoided by unifying all of them in the same goal:
?- min_member_1(X, [A,B,C]), [A, B, C] = [1, 3, 4].
X = A, A = 1,
B = 3,
C = 4 ;
false.
Versions
If the comparisons are intended only for instantiated variables, promise_relation/3 can be changed to check the relation only when both variables get instantiated:
promise_relation(Rel_2, X, Y):-
when((ground(X), ground(Y)), call(Rel_2, X, Y)).
A simple test:
?- L = [_, _, _, _], min_member_1(X, L), L = [3,4,1,2].
L = [3, 4, 1, 2],
X = 1 ;
false.
Edits were made to improve the initial post thanks to false's comments and suggestions.
I have an observation regarding your xmin_member implementation. It fails on this query:
?- xmin_member(1, [X, 2, 3]).
false.
I tried to include the case when the list might include free variables. So, I came up with this:
ymin_member(Min, Lst):-
member(Min, Lst),
maplist(@=<(Min), Lst).
Of course it's worse in terms of efficiency, but it works on that case:
?- ymin_member(1, [X, 2, 3]).
X = 1 ;
false.
?- ymin_member(X, [X, 2, 3]).
true ;
X = 2 ;
false.

Prolog Loops until True

I'm pretty new to Prolog but I'm trying to get this program to give me the first set of twin primes that appears either at or above N.
twins(M) :-
M2 is M + 2,
twin_prime(M, M2),
write(M),
write(' '),
write(M2).
M3 is M + 1,
twins(M3).
However, I'm not completely sure how to go about getting it to loop and repeat until it's true. I've tried using the repeat/0 predicate but I just get stuck in an infinite loop. Does anyone have any tips I could try? I'm pretty new to Prolog.
You're on the right track using tail recursion and @Jake Mitchell's solution works swell. But here are some tips that might help clarify a few basic concepts in Prolog:
First, it seems like your predicate twins/1 is actually defining a relationship between 2 numbers, namely, the two twin primes. Since Prolog is great for writing very clear, declarative, relational programs, you might make the predicate more precise and explicit by making it twin_primes/2. (That this should be a binary predicate is also pretty clear from your name for the predicate, since one thing cannot be twins...)
One nice bonus of explicitly working with a binary predicate when describing binary relations is that we no longer have to fuss with IO operations to display our results. We'll simply be able to query twin_primes(X,Y) and have the results returned as Prolog reports back on viable values of X and Y.
Second, and more importantly, your current definition of twins/1 wants to describe a disjunction: "twins(M) is true if M and M + 2 are both prime or if M3 is M + 3 and twins(M3) is true". The basic way of expressing disjunctions like this is by writing multiple clauses. A single clause of the form <Head> :- <Body> declares that the Head is true if all the statements composing the Body are true. Several clauses with the same head, like <Head> :- <Body1>. <Head> :- <Body2>. ..., declare that Head is true if Body1 is true or if Body2 is true. (Note that a series of clauses defining rules for a predicate are evaluated sequentially, from top to bottom. This is pretty important, since it introduces non-declarative elements into the foundations of our programs, and it can be exploited to achieve certain results.)
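As a small illustration of that point (my own example, not from the original answer), the two formulations below describe the same disjunction:
mother(ann).
father(bob).

% "parent(X) is true if X is a mother or if X is a father":
parent(X) :- mother(X).
parent(X) :- father(X).

% the same relation as a single clause with an explicit disjunction:
parent2(X) :- ( mother(X) ; father(X) ).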
In fact, you are only a small step from declaring a second rule for twins/1. You just tried putting both clause-bodies under the same head instance. Prolog requires the redundant measure of declaring two different rules in cases like this. Your code should be fine (assuming your definition of twin_prime/2 works), if you just change it like so:
twins(M) :-
M2 is M + 2,
twin_prime(M, M2),
write(M),
write(' '),
write(M2).
twins(M) :-
\+twin_prime(M, M2), %% `\+` means "not"
M3 is M + 1,
twins(M3).
Note that if you take advantage of Prolog's back-tracking, you often don't actually need to effect loops through tail recursion. For example, here's an alternative approach, taking into account some of what I advised previously and using a quick (but not as in "efficient" or "fast") and dirty predicate for generating primes:
prime(2).
prime(P) :-
between(2,inf,P),
N is (P // 2 + 1),
forall(between(2,N,Divisor), \+(0 is P mod Divisor)).
twin_primes(P1, P2) :-
prime(P1),
P2 is P1 + 2,
prime(P2).
twin_primes/2 gets a prime number from prime/1, then calculates P2 and checks if it is prime. Since prime/1 will generate an infinite number of primes on backtracking, twin_primes/2 will just keep asking it for numbers until it finds a satisfactory solution. Note that, if called with two free variables, this twin_primes/2 will generate twin primes:
?- twin_primes(P1, P2).
P1 = 3,
P2 = 5 ;
P1 = 5,
P2 = 7 ;
P1 = 11,
P2 = 13 ;
P1 = 17,
P2 = 19 ;
P1 = 29,
P2 = 31 ;
But it will also verify if two numbers are twin primes if queried with specific values, or give you the twin of a prime, if it exists, if you give a value for P1 but leave P2 free:
?- twin_primes(3, Y).
Y = 5.
There's a handy if-then-else operator that works well for this.
twin_prime(3,5).
twin_prime(5,7).
twin_prime(11,13).
next_twin(N) :-
    A is N+1,
    B is N+2,
    (   twin_prime(N,B)
    ->  write(N),
        write(' '),
        write(B)
    ;   next_twin(A)
    ).
And a quick test:
?- next_twin(5).
5 7
true.
?- next_twin(6).
11 13
true.

DCG for idiomatic phrase preference

I have a manually made DCG rule to select idiomatic phrases
over single words. The DCG rule reads as follows:
seq(cons(X,Y), I, O) :- noun(X, I, H), seq(Y, H, O), \+ noun(_, I, O).
seq(X) --> noun(X).
The first clause is manually made, since (:-)/2 is used instead
of (-->)/2. Can I replace this manually made clause by
some clause that uses standard DCG?
Best Regards
P.S.: Here is some test data:
noun(n1) --> ['trojan'].
noun(n2) --> ['horse'].
noun(n3) --> ['trojan', 'horse'].
noun(n4) --> ['war'].
And here are some test cases. The important test case is the first one, since it only
delivers n3 and not cons(n1,n2). The behaviour of the first test case is what is especially desired:
?- phrase(seq(X),['trojan','horse']).
X = n3 ;
No
?- phrase(seq(X),['war','horse']).
X = cons(n4,n2) ;
No
?- phrase(seq(X),['trojan','war']).
X = cons(n1,n4) ;
No
(To avoid collisions with other non-terminals I renamed your seq//1 to nounseq//1)
Can I replace this manually made clause by some clause that uses standard DCG?
No, because it is not steadfast and it is STO (details below).
Intended meaning
But let me start with the intended meaning of your program. You say you want to select idiomatic phrases over single words. Is your program really doing this? Or, to put it differently, is your definition really unique? I could now construct a counterexample, but let Prolog do the thinking:
nouns --> [] | noun(_), nouns.
?- length(Ph, N), phrase(nouns,Ph),
dif(X,Y), phrase(nounseq(X),Ph), phrase(nounseq(Y),Ph).
Ph = [trojan,horse,trojan], N = 3, X = cons(n1,cons(n2,n1)), Y = cons(n3,n1)
; ...
; Ph = [trojan,horse,war], N = 3, X = cons(n3,n4), Y = cons(n1,cons(n2,n4))
; ... .
So your definition is ambiguous. What you essentially want (probably) is some kind of rewrite system. But those are rarely defined in a determinate manner. What if two words overlap, as with an additional noun(n5) --> [horse, war], etc.?
Conformance
A disclaimer up-front: Currently, the DCG document is still being developed — and comments are very welcome! You find all material in this place. So strictly speaking, there is at the current point in time no notion of conformance for DCG.
Steadfastness
One central property a conforming definition must maintain is the property of steadfastness. So before looking into your definition, I will compare two goals of phrase/3 (running SWI in default mode).
?- Ph = [], phrase(nounseq(cons(n4,n4)),Ph0,Ph).
Ph = [], Ph0 = [war,war]
; false.
?- phrase(nounseq(cons(n4,n4)),Ph0,Ph), Ph = [].
false.
?- phrase(nounseq(cons(n4,n4)),Ph0,Ph).
false.
Moving the goal Ph = [] to the end removes the only solution. Therefore, your definition is not steadfast. This is due to the way you handle (\+)/1: the variable O must not occur within the (\+)/1. But on the other hand, if it does not occur within (\+)/1, you can only inspect the beginning of a sentence, and not the entire sentence.
Subject to occurs-check property
But the situation is worse:
?- set_prolog_flag(occurs_check,error).
true.
?- phrase(nounseq(cons(n4,n4)),Ph0,Ph).
ERROR: noun/3: Cannot unify _G968 with [war|_G968]: would create an infinite tree
So your program relies on STO-unifications (subject-to-occurs-check unifications) whose outcome is explicitly undefined in
ISO/IEC 13211-1 Subclause 7.3.3 Subject to occurs-check (STO) and not subject to occurs-check (NSTO)
This is rather due to your intention to define the intersection of two non-terminals. Consider the following way to express it:
:- op( 950, xfx, //\\). % ASCII approximation for ∩ (U+2229 INTERSECTION)
(NT1 //\\ NT2) -->
call(Xs0^Xs^(phrase(NT1,Xs0,Xs),phrase(NT2,Xs0,Xs))).
% The following is predefined in library(lambda):
^(V0, Goal, V0, V) :-
call(Goal,V).
^(V, Goal, V) :-
call(Goal).
Already with this definition we can get into STO situations:
?- phrase(([a]//\\[a,b]), Ph0,Ph).
ERROR: =/2: Cannot unify _G3449 with [b|_G3449]: would create an infinite tree
In fact, when using rational trees we get:
?- set_prolog_flag(occurs_check,false).
true.
?- phrase(([a]//\\[a,b]), Ph0,Ph).
Ph0 = [a|_S1], % where
_S1 = [b|_S1],
Ph = [b|_S1].
So there is an infinite list which certainly has not much meaning for natural language sentences (except for persons of infinite resource and capacity...).

Solve Cannibals/Missionaries using breadth-first search (BFS) in Prolog?

I am working on solving the classic Missionaries (M) and Cannibals (C) problem: the start state is 3 M and 3 C on the left bank and the goal state is 3 M and 3 C on the right bank. I have completed the basic functions in my program and I need to implement a search strategy such as BFS or DFS.
Basically my code is learned from the Internet. So far I can successfully run the program with the DFS method, but when I try to run it with BFS it always returns false. This is my very first SWI-Prolog program, and I cannot find the problem in my code.
Here is part of my code; I hope you can help me find the problem in it:
solve2 :-
bfs([[[3,3,left]]],[0,0,right],[[3,3,left]],Solution),
printSolution(Solution).
bfs([[[A,B,C]]],[A,B,C],_,[]).
bfs([[[A,B,C]|Visisted]|RestPaths],[D,E,F],Visisted,Moves) :-
findall([[I,J,K],[A,B,C]|Visited]),
(
move([A,B,C],[I,J,K],Description),
safe([I,J,K]),
not(member([I,J,K],Visited)
),
NewPaths
),
append(RestPaths,NewPaths,CurrentPaths),
bfs(CurrentPaths,[D,E,F],[[I,J,K]|Visisted],MoreMoves),
Moves = [ [[A,B,C],[I,J,K],Description] | MoreMoves ].
move([A,B,left],[A1,B,right],'One missionary cross river') :-
A > 0, A1 is A - 1.
% Go this state if left M > 0. New left M is M-1
.
.
.
.
.
safe([A,B,_]) :-
(B =< A ; A = 0),
A1 is 3-A, B1 is 3-B,
(B1 =< A1; A1 =0).
I use findall to find all possible paths before going to the next level. Only the ones that pass safe() will be considered as possible next states. A state will not be used if it already exists. Since my program runs with DFS, I think there is nothing wrong with the move() and safe() predicates. My BFS predicate is adapted from my DFS code, but it does not work.
There is a very simple way to turn a depth-first search program into a breadth-first one, provided the depth-first search is directly mapped to Prolog's search. This technique is called iterative deepening.
Simply add an additional argument to ensure that the search will only go N steps deep.
So a dfs-version:
dfs(State) :-
final(State).
dfs(State1) :-
state_transition(State1, State2),
dfs(State2).
Is transformed into a bfs by adding an argument for the depth. E.g. by using successor-arithmetics:
bfs(State, _) :-
final(State).
bfs(State1, s(X)) :-
state_transition(State1, State2),
bfs(State2, X).
A goal bfs(State,s(s(s(0)))) will now find all derivations requiring 3 or less steps. You still can perform dfs! Simply use bfs(State,X).
To find all derivations use natural_number(X), bfs(State,X).
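natural_number/1 is not a built-in; a common definition under successor arithmetic (my assumption of what is meant here, not part of the original answer) is:
natural_number(0).
natural_number(s(X)) :- natural_number(X).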
Often it is useful to use a list instead of the s(X)-number. This list might contain all intermediary states or the particular transitions performed.
You might hesitate to use this technique, because it seems to recompute a lot of intermediary states ("repeatedly expanded states"). After all, first it searches all paths with one step, then at most two steps, then at most three steps... However, if your problem is a search problem, the branching factor hidden within state_transition/2 will mitigate that overhead. To see this, consider a branching factor of 2: we will only have an overhead of a factor of two, since all the iterations before the last one together cost about as much as the last iteration alone (a geometric series), so the total work is only about twice that of a single search to the final depth. Often, there are easy ways to regain that factor of two: e.g., by speeding up state_transition/2 or final/1.
But the biggest advantage is that it does not consume a lot of space - in contrast to naive dfs.
The Logtalk distribution includes an example, "searching", which implements a framework for state space searching:
https://github.com/LogtalkDotOrg/logtalk3/tree/master/examples/searching
The "classical" problems are included (farmer, missionaries and cannibals, puzzle 8, bridge, water jugs, etc). Some of the search algorithms are adapted (with permission) from Ivan Bratko's book "Prolog programming for artificial intelligence". The example also includes a performance monitor that can give you some basic stats on the performance of a search method (e.g. branching factors and number of state transitions). The framework is easy to extend, both for new problems and new search methods.
If anyone is still interested in a Python solution for this, please find the following. For simplification, only the count of missionaries and cannibals on the left bank is taken into consideration. This is the solution tree.
# M = number of missionaries on the left bank
# C = number of cannibals on the left bank
# B = 1 -> boat on the left bank
# B = 0 -> boat on the right bank
def is_valid(state):
    if(state[0]>3 or state[1]>3 or state[2]>1 or state[0]<0 or state[1]<0 or state[2]<0 or (0<state[0]<state[1]) or (0<(3-state[0])<(3-state[1]))):
        return False
    else:
        return True

def generate_next_states(M,C,B):
    moves = [[1, 0, 1], [0, 1, 1], [2, 0, 1], [0, 2, 1], [1, 1, 1]]
    valid_states = []
    for each in moves:
        if(B==1):next_state = [x1 - x2 for (x1, x2) in zip([M, C, B], each)]
        else:next_state = [x1 + x2 for (x1, x2) in zip([M, C, B], each)]
        if (is_valid(next_state)):
            # print(next_state)
            valid_states.append(next_state)
    return valid_states

solutions = []
def find_sol(M,C,B,visited):
    if([M,C,B]==[0,0,0]):  # everyone crossed successfully
        # print("Solution reached, steps: ",visited+[[0,0,0]])
        solutions.append(visited+[[0,0,0]])
        return True
    elif([M,C,B] in visited):  # prevent looping
        return False
    else:
        visited.append([M,C,B])
        if(B==1):  # boat is on the left
            for each_s in generate_next_states(M,C,B):
                find_sol(each_s[0],each_s[1],each_s[2],visited[:])
        else:  # boat is on the right
            for each_s in generate_next_states(M,C,B):
                find_sol(each_s[0],each_s[1],each_s[2],visited[:])

find_sol(3,3,1,[])
solutions.sort()
for each_sol in solutions:
    print(each_sol)
Please refer to this gist to see a possible solution, maybe helpful to your problem.
Gist: solve Missionaries and cannibals in Prolog
I've solved with depth-first and then with breadth-first, attempting to clearly separate the reusable part from the state search algorithm:
miss_cann_dfs :-
initial(I),
solve_dfs(I, [I], Path),
maplist(writeln, Path), nl.
solve_dfs(S, RPath, Path) :-
final(S),
reverse(RPath, Path).
solve_dfs(S, SoFar, Path) :-
move(S, T),
\+ memberchk(T, SoFar),
solve_dfs(T, [T|SoFar], Path).
miss_cann_bfs :-
initial(I),
solve_bfs([[I]], Path),
maplist(writeln, Path), nl.
solve_bfs(Paths, Path) :-
extend(Paths, Extended),
( member(RPath, Extended),
RPath = [H|_],
final(H),
reverse(RPath, Path)
; solve_bfs(Extended, Path)
).
extend(Paths, Extended) :-
findall([Q,H|R],
( member([H|R], Paths),
move(H, Q),
\+ member(Q, R)
), Extended),
Extended \= [].
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% problem representation
% independent from search method
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
initial((3,3, >, 0,0)).
final((0,0, <, 3,3)).
% apply a *valid* move
move((M1i,C1i, Bi, M2i,C2i), (M1f,C1f, Bf, M2f,C2f)) :-
direction(Bi, F1, F2, Bf),
who_move(MM, CM),
M1f is M1i + MM * F1, M1f >= 0,
C1f is C1i + CM * F1, C1f >= 0,
( M1f >= C1f ; M1f == 0 ),
M2f is M2i + MM * F2, M2f >= 0,
C2f is C2i + CM * F2, C2f >= 0,
( M2f >= C2f ; M2f == 0 ).
direction(>, -1, +1, <).
direction(<, +1, -1, >).
% valid placements on boat
who_move(M, C) :-
M = 2, C = 0 ;
M = 1, C = 0 ;
M = 1, C = 1 ;
M = 0, C = 2 ;
M = 0, C = 1 .
I suggest you structure your code in a similar way, with a predicate similar to extend/2 that makes it clear when to stop the search.
If your Prolog system has a forward chainer you can also solve
the problem by modelling it via forward chaining rules. Here
is an example of how to solve a water jug problem in Jekejeke Minlog.
The state is represented by a predicate state/2.
You first give a rule that filters duplicates as follows. The
rule says that an incoming state/2 fact should be removed,
if it is already in the forward store:
% avoid duplicate state
unit &:- &- state(X,Y) && state(X,Y), !.
Then you give rules that state that search need not be continued
when a final state is reached. In the present example we check
that one of the vessels contains 1 liter of water:
% halt for final states
unit &:- state(_,1), !.
unit &:- state(1,_), !.
As a next step one models the state transitions as forward chaining
rules. This is straightforward. We model emptying, filling and pouring
of vessels:
% emptying a vessel
state(0,X) &:- state(_,X).
state(X,0) &:- state(X,_).
% filling a vessel
state(5,X) &:- state(_,X).
state(X,7) &:- state(X,_).
% pouring water from one vessel to the other vessel
state(Z,T) &:- state(X,Y), Z is min(5,X+Y), T is max(0,X+Y-5).
state(T,Z) &:- state(X,Y), Z is min(7,X+Y), T is max(0,X+Y-7).
We can now use the forward chaining engine to do the job for us. It
will not do iterative deepening and it will also not do breadth-first search.
It will just do unit resolution by a strategy that is greedy for the
given fact, and the process only terminates because the state space
is finite. Here is the result:
?- post(state(0,0)), posted.
state(0, 0).
state(5, 0).
state(5, 7).
state(0, 7).
Etc..
The approach will tell you whether there is a solution, but not explain
the solution. One approach to make it explainable is to use a fact
state/4 instead of a fact state/2. The last two arguments are used for
a list of actions and for the length of the list.
The rule that avoids duplicates is then changed for a rule that picks
the smallest solution. It reads as follows:
% choose shorter path
unit &:- &- state(X,Y,_,N) && state(X,Y,_,M), M<N, !.
unit &:- state(X,Y,_,N) && &- state(X,Y,_,M), N<M.
We then get:
?- post(state(0,0,[],0)), posted.
state(0, 0, [], 0).
state(5, 0, [fl], 1).
state(5, 7, [fr,fl], 2).
state(0, 5, [plr,fl], 2).
Etc..
With a little helper predicate we can force an explanation of
the actions that lead to a path:
?- post(state(0,0,[],0)), state(1,7,L,_), explain(L).
0-0
fill left vessel
5-0
pour left vessel into right vessel
0-5
fill left vessel
5-5
pour left vessel into right vessel
3-7
empty right vessel
3-0
pour left vessel into right vessel
0-3
fill left vessel
5-3
pour left vessel into right vessel
1-7
Bye
Source Code: Water Jug State
http://www.xlog.ch/jekejeke/forward/jugs3.p
Source Code: Water Jug State and Path
http://www.xlog.ch/jekejeke/forward/jugs3path.p
