Prolog: order of goals causes pathfinding to fail

I am developing a pathfinding algorithm in Prolog that gives all nodes accessible by a path from a starting node. To avoid duplicate paths, visited nodes are kept in a list.
Nodes and neighbors are defined as below:
node(a).
node(b).
node(c).
node(d).
node(e).
edge(a,b).
edge(b,c).
edge(c,d).
edge(b,d).
neighbor(X,Y) :- edge(X,Y).
neighbor(X,Y) :- edge(Y,X).
The original algorithm below works fine:
path2(X,Y) :-
    pathHelper(X,Y,[X]).

pathHelper(X,Y,L) :-
    neighbor(X,Y),
    \+ member(Y,L).
pathHelper(X,Y,H) :-
    neighbor(X,Z),
    \+ member(Z,H),
    pathHelper(Z,Y,[Z|H]).
This works fine:
[debug] ?- path2(a,X).
X = b ;
X = c ;
X = d ;
X = d ;
X = c ;
false.
However, when changing the order of the two goals in the first clause of pathHelper/3, as below,
pathHelper(X,Y,L) :-
    \+ member(Y,L),
    neighbor(X,Y).
When trying the same here, swipl returns the following:
[debug] ?- path2(a,X).
false.
The query doesn't work anymore and only returns false. I have tried to understand this through the trace command, but still can't make sense of what exactly is wrong.
In other words, I am failing to understand why the order of neighbor(X,Y) and \+ member(Y,L) is crucial here. It makes sense to have neighbor(X,Y) first in terms of efficiency, but not in terms of correctness.

You are now encountering the not so clean-cut borders of pure Prolog and its illogical surroundings. Welcome to the real world.
Or rather, not welcome! Instead, let's try to improve your definition. The key problem is
\+ member(Y, [a]), Y = b.
which fails while
Y = b, \+ member(Y,[a]).
succeeds. There is no logic to justify this. It's just the operational mechanism of Prolog's built-in (\+)/1.
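You can check this at the top level (SWI-Prolog shown):
?- \+ member(Y, [a]), Y = b.
false.

?- Y = b, \+ member(Y, [a]).
Y = b.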
Happily, we can improve upon this. Enter non_member/2.
non_member(_X, []).
non_member(X, [E|Es]) :-
    dif(X, E),
    non_member(X, Es).
Now,
?- non_member(Y, [a]).
dif(Y,a).
Mark this answer, it says: Yes, Y is not an element of [a], provided Y is different from a. Think of the many solutions this answer includes, like Y = 42, or Y = b and infinitely many more such solutions that are not a. Infinitely many solutions captured in nine characters!
Now, both non_member(Y, [a]), Y = b and Y = b, non_member(Y, [a]) succeed. So exchanging them influences only runtime and space consumption. While we are at it: note that you check for non-membership in two clauses. You can factor this out. For a generic solution to this, see closure/3. With it, you simply say: closure(neighbor, A,B).
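If you do not have closure/3 at hand, here is a minimal sketch in its spirit, reusing non_member/2 from above (the helper closure_/4 is my paraphrase; see the linked answer for the canonical version):

closure(R_2, X0, X) :-
    closure_(X0, X, [X0], R_2).

closure_(X0, X, _Visited, R_2) :-
    call(R_2, X0, X).
closure_(X0, X, Visited, R_2) :-
    call(R_2, X0, X1),
    non_member(X1, Visited),
    closure_(X1, X, [X1|Visited], R_2).

Now closure(neighbor, a, B) enumerates every node reachable from a (duplicate answers are possible), and it terminates because the visited list cannot grow beyond the number of nodes.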
Also consider the case where you have only edge(a,a). Your definition fails here for path2(a,X). But shouldn't this rather succeed?
And the name path2/2 is not that fitting, rather reserve this word for an actual path.

The doubt you have is related to how Prolog handles negation. Prolog uses negation as failure: to negate a goal g (written not(g) or \+ g), Prolog tries to prove g by executing it; if g fails, then not(g) succeeds, and vice versa.
Keep in mind also that, after the execution of not(g), if the goal has variables, they will not be instantiated. This is because Prolog would have to instantiate the variables with all the terms that make g fail, and this is likely an infinite set (for example, for a list, not(member(A,[a])) would have to bind A to every term that is not in the list).
Let's see an example. Consider this simple program:
test :-
    L = [a,b],
    \+ member(A, L),
    writeln(A).
and run it with ?- trace, test. First of all you get a "Singleton variable in \+: A" warning, for the reason I explained before, but let's ignore it and see what happens.
Call:test
Call:_5204=[a, b]
Exit:[a, b]=[a, b]
Call:lists:member(_5204, [a, b])
Exit:lists:member(a, [a, b]) % <-----
Fail:test
false
You see at the highlighted line that the variable A is instantiated to a and so member/2 succeeds and so \+ member(A,L) is false.
So, in your code, if you write pathHelper(X,Y,L) :- \+ member(Y,L), neighbor(X,Y)., this clause will always fail because Y is not sufficiently instantiated when the negation is tried. If you swap the two goals, Y will be ground by then, and so member/2 can fail (and \+ member succeed).


CLPFD ins operator yields not sufficiently instantiated error

So, my goal is to make a map colourer in Prolog. Here's the map I'm using:
[map image: six regions A-F]
And these are my colouring constraints:
colouring([A,B,C,D,E,F]) :-
    maplist(#\=(A), [B,C,D,E]),
    maplist(#\=(B), [C,D,F]),
    C #\= D,
    maplist(#\=(D), [E,F]),
    E #\= F.
Where [A,B,C,D,E,F] is a list of numbers (colours) from 1 to N.
So I want my solver, given a list L of 6 colours and a natural number N, to determine both the colours and N, in a way that even the most general query yields results:
regions_ncolors(L,N) :- colouring(L), L ins 1..N, label(L).
Where the most general query is regions_ncolors(L,N).
However, the ins operator doesn't seem to accept a variable N; it instead yields an "Arguments are not sufficiently instantiated" error. I've tried using this solution instead:
int_cset_(N,Acc,Acc) :- N #= 0.
int_cset_(N,Acc,Cs) :- N_1 #= N-1, int_cset_(N_1,[N|Acc],Cs).
int_cset(N,Cs) :- int_cset_(N,[],Cs).
% most general solver
regions_ncolors(L,N) :- colouring(L), int_cset(N,Cs), subset(L,Cs), label(L).
Where the arguments of int_cset(N,Cs) are a natural number N and the counting set Sn = {1,2,...,N}.
But this solution is buggy: regions_ncolors(L,N). only returns the same (one) solution for all N, and when I try to add a constraint on N, it goes into an infinite loop.
So what can I do to make the most general query work both ways (i.e. with uninstantiated variables)?
Thanks in advance!
Btw, I added a swi-prolog tag in my last post, although it was removed by moderation. I don't know if this question is specific to SWI-Prolog, which is why I'm keeping the tag, just in case :)
Your colouring is too specific: you encode the topology of your map into it. Not a problem as such, but it defeats the purpose of then having a "most general query" solution for just any list.
If you want to avoid the problem of having a free variable instead of a list, you could first instantiate the list with length/2. Compare:
?- L ins 1..3.
ERROR: Arguments are not sufficiently instantiated
ERROR: In:
ERROR: [16] throw(error(instantiation_error,_86828))
ERROR: [10] clpfd:(_86858 ins 1..3) ...
Is that the same problem as you see?
If you first make a list and a corresponding set:
?- length(L, N), L ins 1..N.
L = [],
N = 0 ;
L = [1],
N = 1 ;
L = [_A, _B],
N = 2,
_A in 1..2,
_B in 1..2 ;
L = [_A, _B, _C],
N = 3,
_A in 1..3,
_B in 1..3,
_C in 1..3 .
If you use length/2 like this, you will enumerate the possible lists and integer sets completely outside of the CLP(FD) labeling. You can then add more constraints on the variables in the list and, if necessary, use labeling.
Does that help you get any further with your problem? I am not sure how it helps for the colouring problem. You would need a different representation of the map topology so that you don't have to manually define it within a predicate like your colouring/1 you have in your question.
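To make this concrete, here is one hedged sketch (the predicate shape is my own arrangement, not a canonical solution) of how length/2 can drive the most general query:

:- use_module(library(clpfd)).

regions_ncolors(L, N) :-
    length(_, N),     % fairly enumerates N = 0, 1, 2, ...
    N #> 0,
    colouring(L),     % the constraints from the question
    L ins 1..N,
    label(L).

Since the constraints contain a 4-clique (A, B, C, D are pairwise distinct), the first answers of regions_ncolors(L, N) should appear at N = 4.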
There are several issues in your program.
subset/2 is impure
SWI's (by default) built-in predicate subset/2 is not the pure relation you are hoping for. Instead, it expects that both arguments are already sufficiently instantiated. And if not, it takes a guess and sticks to it:
?- colouring(L), subset(L,[1,2,3,4,5]).
L = [1,2,3,4,2,1].
?- colouring(L), subset(L,[1,2,3,4,5]), L = [2|_].
false.
?- L = [2|_], colouring(L), subset(L,[1,2,3,4,5]), L = [2|_].
L = [2,1,3,4,1,2].
With a pure definition it is impossible that adding a further goal as L = [2|_] in the third query makes a failing query succeed.
In general it is a good idea to not interfere with labeling/2 except for the order of variables and the options argument. The internal implementation is often much faster than manual instantiations.
Also, your map is far too simple to expose subset/2's weakness. I am not sure what the minimal failing graph is, but here is one such example from
R. Janczewski et al., The smallest hard-to-color graph for algorithm DSATUR, Discrete Mathematics 236 (2001), p. 164.
colouring_m13([K1,K2,K3,K6,K5,K7,K4]) :-
    maplist(#\=(K1), [K2,K3,K4,K7]),
    maplist(#\=(K2), [K3,K5,K6]),
    maplist(#\=(K3), [K4,K5]),
    maplist(#\=(K4), [K5,K7]),
    maplist(#\=(K5), [K6,K7]),
    maplist(#\=(K6), [K7]).
?- colouring_m13(L), subset(L,[1,2,3,4]).
false. % incomplete
?- L = [3|_], colouring_m13(L), subset(L,[1,2,3,4]).
L = [3,1,2,2,3,1,4].
int_cset/2 never terminates
... (except for some error cases like int_cset(non_integer, _)). As an example, consider:
?- int_cset(1,Cs).
Cs = [1]
; loops.
And don't get fooled by the fact that an actual solution was found! It still does not terminate.
@Luis: But how come? I'm getting baffled by this, the same thing is happening on ...
To see this, you need the notion of a failure-slice which helps to identify the responsible part in your program. With some falsework consisting of goals false the responsible part is exposed.
All unnecessary parts have been removed by false. What remains has to be changed somehow.
int_cset_(N,Acc,Acc) :- false, N #= 0.
int_cset_(N,Acc,Cs) :- N1 #= N-1, int_cset_(N1,[N|Acc],Cs), false.
int_cset(N,Cs) :- int_cset_(N,[],Cs), false.
?- int_cset(1, Cs), false.
loops.
Adding the redundant goal N #> 0 (equivalently, N1 #>= 0) to the recursive clause will avoid the unnecessary non-termination.
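A sketch of the repaired definition, keeping the original naming (the int_cset/2 wrapper is unchanged):

int_cset_(N,Acc,Acc) :- N #= 0.
int_cset_(N,Acc,Cs) :- N #> 0, N_1 #= N-1, int_cset_(N_1,[N|Acc],Cs).

?- int_cset(1, Cs), false.
false.              % terminates now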
This alone will not solve your problem since if N is not given, you will still encounter non-termination due to the following failure slice:
regions_ncolors(L,N) :-
    colouring(L),
    int_cset(N,Cs), false,
    subset(L,Cs),
    label(L).
In int_cset(N,Cs), Cs occurs for the first time and thus cannot influence termination (there is another reason too: its definition would ignore it as well), and therefore only N has a chance to induce termination.
The actual solution has already been suggested by @TA_intern: using length/2, which frees one from such mode-infested chores.

Computer Reasoning about Prologish Boolos' Curious Inference

Boolos' curious inference was originally formulated with equations here. It is a recursive definition of a function f and a predicate d via the syntax of N+, the natural numbers without zero, generated from 1 and s(.).
But it can also be formulated with Horn clauses. The logical content is not exactly the same (the predicate f captures only the positive aspect of the function), but the problem type is the same. Take the following Prolog program:
f(_, 1, s(1)).
f(1, s(X), s(s(Y))) :- f(1, X, Y).
f(s(X), s(Y), T) :- f(s(X), Y, Z), f(X, Z, T).
d(1).
d(s(X)) :- d(X).
For example:
?- f(X,X,Y).
X = 1,
Y = s(1)
X = s(1),
Y = s(s(s(1)))
X = s(s(1)),
Y = s(s(s(s(s(s(s(s(s(s(...))))))))))
ERROR: Out of global stack
What is the theoretical logical outcome of the query below, and can you demonstrably have a computer program in our time and space that produces the outcome, i.e. post the program on gist so that everybody can run it?
?- f(s(s(s(s(1)))), s(s(s(s(1)))), X), d(X).
If the program that does the job of certifying the result is not a Prolog interpreter itself, like here, what would do the job, especially suited to this Prologish problem formulation?
One solution: Abstract interpretation
Preliminaries
In this answer, I use an interpreter to show that this holds. However, it is not a Prolog interpreter, because it does not interpret the program in exactly the same way Prolog interprets the program.
Instead, it interprets the program in a more abstract way. Such interpreters are therefore called abstract interpreters.
Program representation
Critically, I work directly with the source program, using only modifications that we, by purely algebraic reasoning, know can be safely applied. It helps tremendously for such reasoning that your source program is completely pure by construction, since it only uses pure predicates.
To simplify reasoning about the program, I now make all unifications explicit. It is easy to see that this does not change the meaning of the program, and can be easily automated. I obtain:
f(_, X, Y) :-
    X = 1,
    Y = s(1).
f(Arg, X, Y) :-
    Arg = 1,
    X = s(X0),
    Y = s(s(Y0)),
    f(Arg, X0, Y0).
f(X, Y, T) :-
    X = s(X0),
    Y = s(Y0),
    f(X, Y0, Z),
    f(X0, Z, T).
I leave it as an easy exercise to show that this is declaratively equivalent to the original program.
The abstraction
The abstraction I use is the following: Instead of reasoning over the concrete terms 1, s(1), s(s(1)) etc., I use the atom d for each term T for which I can prove that d(T) holds.
Let me show you what I mean by the following interpretation of unification:
interpret(d = N) :- d(N).
This says:
If d(N) holds, then N is to be regarded identical to the atom d, which, as we said, shall denote any term for which d/1 holds.
Note that this differs significantly from what an actual unification between concrete terms d and N means! For example, we obtain:
?- interpret(X = s(s(1))).
X = d.
Pretty strange, but I hope you can get used to it.
Extending the abstraction
Of course, interpreting a single unification is not enough to reason about this program, since it also contains additional language elements.
I therefore extend the abstract interpretation to:
conjunction
calls of f/3.
Interpreting conjunctions is easy, but what about f/3?
Incremental derivations
If, during abstract interpretation, we encounter the goal f(X, Y, Z), then we know the following: In principle, the arguments can of course be unified with any terms for which the goal succeeds. So we keep track of those arguments for which we know the query can succeed in principle.
We thus equip the predicate with an additional argument: A list of f/3 goals that are logical consequences of the program.
In addition, we implement the following very important provision: If we encounter a unification that cannot be safely interpreted in abstract terms, then we throw an error instead of failing silently. This may for example happen if the unification would fail when regarded as an abstract interpretation although it would succeed as a concrete unification, or if we cannot fully determine whether the arguments are of the intended domain. The primary purpose of this provision is to avoid unintentional elimination of actual solutions due to oversights in the abstract interpreter. This is the most critical aspect in the interpreter, and any proof-theoretic mechanism will face closely related questions (how can we ensure that no proofs are missed?).
Here it is:
interpret(Var = N, _) :-
    must_be(var, Var),
    must_be(ground, N),
    d(N),
    Var = d.
interpret((A,B), Ds) :-
    interpret(A, Ds),
    interpret(B, Ds).
interpret(f(A, B, C), Ds) :-
    member(f(A, B, C), Ds).
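For instance, with d/1 from the question loaded, the clauses so far already handle small conjunctions (SWI-Prolog top-level output):

?- interpret((X = 1, Y = s(1)), []).
X = Y, Y = d.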
Quis custodiet ipsos custodes?
How can we tell whether this is actually correct? That's the tough part! In fact, it turns out that the above is not sufficient to be certain to catch all cases, because it may simply fail if d(N) does not hold. It is obviously not acceptable for the abstract interpreter to fail silently for cases it cannot handle. So we need at least one more clause:
interpret(Var = N, _) :-
    must_be(var, Var),
    must_be(ground, N),
    \+ d(N),
    domain_error(d, N).
In fact, an abstract interpreter becomes a lot less error-prone when we reason about ground terms, and so I will use the atom any to represent "any term at all" in derived answers.
Over this domain, the interpretation of unification becomes:
interpret(Var = N, _) :-
    must_be(ground, N),
    (   var(Var) ->
        (   d(N) -> Var = d
        ;   N = s(d) -> Var = d
        ;   N = s(s(d)) -> Var = d
        ;   domain_error(d, N)
        )
    ;   Var == any -> true
    ;   domain_error(any, Var)
    ).
In addition, I have implemented further cases of the unification over this abstract domain. I leave it as an exercise to ponder whether this correctly models the intended semantics, and to implement further cases.
As it will turn out, this definition suffices to answer the posted question. However, it clearly leaves a lot to be desired: It is more complex than we would like, and it becomes increasingly harder to tell whether we have covered all cases. Note though that any proof-theoretic approach will face closely corresponding issues: The more complex and powerful it becomes, the harder it is to tell whether it is still correct.
All derivations: See you at the fixpoint!
It now remains to deduce everything that follows from the original program.
Here it is, a simple fixpoint computation:
derivables(Ds) :-
    functor(Head, f, 3),
    findall(Head-Body, clause(Head, Body), Clauses),
    derivables_fixpoint(Clauses, [], Ds).

derivables_fixpoint(Clauses, Ds0, Ds) :-
    findall(D, clauses_derivable(Clauses, Ds0, D), Ds1, Ds0),
    term_variables(Ds1, Vs),
    maplist(=(any), Vs),
    sort(Ds1, Ds2),
    (   same_length(Ds2, Ds0) -> Ds = Ds0
    ;   derivables_fixpoint(Clauses, Ds2, Ds)
    ).

clauses_derivable(Clauses, Ds0, Head) :-
    member(Head-Body, Clauses),
    interpret(Body, Ds0).
Since we are deriving ground terms, sort/2 removes duplicates.
Example query:
?- derivables(Ds).
ERROR: Arguments are not sufficiently instantiated
Somewhat anticlimactically, the abstract interpreter is unable to process this program!
Commutativity to the rescue
In a proof-theoretic approach, we search for, well, proofs. In an interpreter-based approach, we can either improve the interpreter or apply algebraic laws to transform the source program in a way that preserves essential properties.
In this case, I will do the latter, and leave the former as an exercise. Instead of searching for proofs, we are searching for equivalent ways to write the program so that our interpreter can derive the desired properties. For example, I now use commutativity of conjunction to obtain:
f(_, X, Y) :-
    X = 1,
    Y = s(1).
f(Arg, X, Y) :-
    Arg = 1,
    f(Arg, X0, Y0),
    X = s(X0),
    Y = s(s(Y0)).
f(X, Y, T) :-
    f(X, Y0, Z),
    f(X0, Z, T),
    X = s(X0),
    Y = s(Y0).
Again, I leave it as an exercise to carefully check that this program is declaratively equivalent to your original program.
iamque opus exegi, because:
?- derivables(Ds).
Ds = [f(any, d, d)].
This shows that in each solution of f/3, the last two arguments are always terms for which d/1 holds! In particular, it also holds for the sample arguments you posted, even if there is no hope to ever actually compute the concrete terms!
Conclusion
By abstract interpretation, we have shown:
for all X where f(_, _, X) holds, d(X) also holds
beyond that, for all Y where f(_, Y, _) holds, d(Y) also holds.
The question only asked for a special case of the first property. We have shown significantly more!
In summary:
If f(_, Y, X) holds, then d(X) holds and d(Y) holds.
Prolog makes it comparatively easy and convenient to reason about Prolog programs. This often allows us to derive interesting properties of Prolog programs, such as termination properties and type information.
Please see Reasoning about programs for references and more explanation.
+1 for a great question and reference.

Steadfastness: Definition and its relation to logical purity and termination

So far, I have always taken steadfastness in Prolog programs to mean:
If, for a query Q, there is a subterm S, such that there is a term T that makes ?- S=T, Q. succeed although ?- Q, S=T. fails, then one of the predicates invoked by Q is not steadfast.
Intuitively, I thus took steadfastness to mean that we cannot use instantiations to "trick" a predicate into giving solutions that are otherwise not only never given, but rejected. Note the difference for nonterminating programs!
In particular, at least to me, logical-purity always implied steadfastness.
Example. To better understand the notion of steadfastness, consider an almost classical counterexample of this property that is frequently cited when introducing advanced students to operational aspects of Prolog, using a wrong definition of a relation between two integers and their maximum:
integer_integer_maximum(X, Y, Y) :-
    Y >= X,
    !.
integer_integer_maximum(X, _, X).
A glaring mistake in this—shall we say "wavering"—definition is, of course, that the following query incorrectly succeeds:
?- M = 0, integer_integer_maximum(0, 1, M).
M = 0. % wrong!
whereas exchanging the goals yields the correct answer:
?- integer_integer_maximum(0, 1, M), M = 0.
false.
A good solution of this problem is to rely on pure methods to describe the relation, using for example:
integer_integer_maximum(X, Y, M) :-
    M #= max(X, Y).
This works correctly in both cases, and can even be used in more situations:
?- integer_integer_maximum(0, 1, M), M = 0.
false.
?- M = 0, integer_integer_maximum(0, 1, M).
false.
| ?- X in 0..2, Y in 3..4, integer_integer_maximum(X, Y, M).
X in 0..2,
Y in 3..4,
M in 3..4 ? ;
no
Now the paper Coding Guidelines for Prolog by Covington et al., co-authored by the very inventor of the notion, Richard O'Keefe, contains the following section:
5.1 Predicates must be steadfast.
Any decent predicate must be “steadfast,” i.e., must work correctly if its output variable already happens to be instantiated to the output value (O’Keefe 1990).
That is,
?- foo(X), X = x.
and
?- foo(x).
must succeed under exactly the same conditions and have the same side effects.
Failure to do so is only tolerable for auxiliary predicates whose call patterns are strongly constrained by the main predicates.
Thus, the definition given in the cited paper is considerably stricter than what I stated above.
For example, consider the pure Prolog program:
nat(s(X)) :- nat(X).
nat(0).
Now we are in the following situation:
?- nat(0).
true.
?- nat(X), X = 0.
nontermination
This clearly violates the property of succeeding under exactly the same conditions, because one of the queries no longer succeeds at all.
Hence my question: Should we call the above program not steadfast? Please justify your answer with an explanation of the intention behind steadfastness and its definition in the available literature, its relation to logical-purity as well as relevant termination notions.
In The Craft of Prolog, page 96, Richard O'Keefe says: "we call the property of refusing to give wrong answers even when the query has an unexpected form (typically supplying values for what we normally think of as inputs*) steadfastness".
*I am not sure if this should be "outputs", i.e. in your query ?- M = 0, integer_integer_maximum(0, 1, M). M = 0. % wrong!, M is used as an input, but the clause has been designed for it to be an output.
In nat(X), X = 0. we are using X as an output variable, not an input variable, but it has not given a wrong answer, as it does not give any answer. So I think under that definition it could be steadfast.
A rule of thumb he gives is 'postpone output unification until after the cut.' Here we have not got a cut, but we still want to postpone the unification.
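Following that rule of thumb, a steadfast version of the earlier maximum example can also be written without clpfd, postponing the output unification into an if-then-else (my sketch, not from the book):

integer_integer_maximum(X, Y, M) :-
    (   Y >= X
    ->  M = Y
    ;   M = X
    ).

Now ?- M = 0, integer_integer_maximum(0, 1, M). correctly fails, because M is only unified after the comparison has committed.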
However, I would have thought it would be sensible to have the base case first rather than the recursive case, so that nat(X), X = 0. would initially succeed... but you would still have other problems.

Collect all "minimum" solutions from a predicate

Given the following facts in a database:
foo(a, 3).
foo(b, 2).
foo(c, 4).
foo(d, 3).
foo(e, 2).
foo(f, 6).
foo(g, 3).
foo(h, 2).
I want to collect all first arguments that have the smallest second argument, plus the value of the second argument. First try:
find_min_1(Min, As) :-
    setof(B-A, foo(A, B), [Min-_|_]),
    findall(A, foo(A, Min), As).
?- find_min_1(Min, As).
Min = 2,
As = [b, e, h].
Instead of setof/3, I could use aggregate/3:
find_min_2(Min, As) :-
    aggregate(min(B), A^foo(A, B), Min),
    findall(A, foo(A, Min), As).
?- find_min_2(Min, As).
Min = 2,
As = [b, e, h].
NB
This only gives the same results if I am looking for the minimum of a number. If an arithmetic expression is involved, the results might be different. If a non-number is involved, aggregate(min(...), ...) will throw an error!
Or, instead, I can use the full key-sorted list:
find_min_3(Min, As) :-
    setof(B-A, foo(A, B), [Min-First|Rest]),
    min_prefix([Min-First|Rest], Min, As).

min_prefix([Min-First|Rest], Min, [First|As]) :-
    !,
    min_prefix(Rest, Min, As).
min_prefix(_, _, []).
?- find_min_3(Min, As).
Min = 2,
As = [b, e, h].
Finally, to the question(s):
Can I do this directly with library(aggregate)? It feels like it should be possible....
Or is there a predicate like std::partition_point from the C++ standard library?
Or is there some easier way to do this?
EDIT:
To be more descriptive. Say there was a (library) predicate partition_point/4:
partition_point(Pred_1, List, Before, After) :-
    partition_point_1(List, Pred_1, Before, After).

partition_point_1([], _, [], []).
partition_point_1([H|T], Pred_1, Before, After) :-
    (   call(Pred_1, H)
    ->  Before = [H|B],
        partition_point_1(T, Pred_1, B, After)
    ;   Before = [],
        After = [H|T]
    ).
(I don't like the name but we can live with it for now)
Then:
find_min_4(Min, As) :-
    setof(B-A, foo(A, B), [Min-X|Rest]),
    partition_point(is_min(Min), [Min-X|Rest], Min_pairs, _),
    pairs_values(Min_pairs, As).

is_min(Min, Min-_).
?- find_min_4(Min, As).
Min = 2,
As = [b, e, h].
What is the idiomatic approach to this class of problems?
Is there a way to simplify the problem?
Many of the following remarks could be added to many programs here on SO.
Imperative names
Every time you write an imperative name for something that is a relation, you reduce your understanding of relations. Not much, just a little bit. Many common Prolog idioms like append/3 do not set a good example. Think of append(As,As,AsAs). The first argument of find_min(Min, As) is the minimum, so minimum_with_nodes/2 might be a better name.
findall/3
Do not use findall/3 unless the uses are rigorously checked: essentially, everything must be ground. In your case it happens to work. But once you generalize foo/2 a bit, you will lose. And that is frequently a problem: you write a tiny program and it seems to work.
Once you move to bigger ones, the same approach no longer works. findall/3 is (compared to setof/3) like a bull in a china shop, smashing the fine fabric of shared variables and quantification. Another problem is that accidental failure does not lead to failure of findall/3, which often leads to bizarre, hard-to-imagine corner cases.
Untestable, too specific program
Another problem is somewhat related to findall/3, too. Your program is so specific that it is quite improbable you will ever test it. And marginal changes will invalidate your tests, so you will soon give up on testing. Let's see what is specific: primarily the foo/2 relation. Yes, it is only an example; but think of how to set up a test configuration where foo/2 may change. After each change (writing a new file) you will have to reload the program. This is so complex that chances are you will never do it. I presume you do not have a test harness for that. Plunit, for one, does not cover such testing.
As a rule of thumb: If you cannot test a predicate on the top level you never will. Consider instead
minimum_with(Rel_2, Min, Els)
With such a relation, you can now have a generalized xfoo/3 with an additional parameter, say:
xfoo(o, A,B) :-
    foo(A,B).
xfoo(n, A,B) :-
    newfoo(A,B).
and you most naturally get two answers for minimum_with(xfoo(X), Min, Els). Had you used findall/3 instead of setof/3, you would already have serious problems. Or just in general: minimum_with(\A^B^member(A-B, [x-10,y-20]), Min, Els). So you can play around at the top level and produce lots of interesting test cases.
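For concreteness, here is one hypothetical definition of minimum_with/3 built on setof/3, reusing min_prefix/3 from version 3 (the name and argument order are just the suggestion above, not an established library predicate):

minimum_with(Rel_2, Min, Els) :-
    setof(V-E, call(Rel_2, E, V), [Min-E0|Pairs]),
    min_prefix([Min-E0|Pairs], Min, Els).

?- minimum_with(foo, Min, Els).
Min = 2,
Els = [b, e, h].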
Unchecked border cases
Your version 3 is clearly my preferred approach; however, there are still some parts that can be improved. In particular, if there are answers that contain variables as a minimum, these should be checked.
And certainly, setof/3 also has its limits, and ideally you would test them. Answers should not contain constraints, in particular not on the relevant variables. This shows how setof/3 itself has certain limits. After the pioneering phase, SICStus produced many errors for constraints in such cases (mid 1990s), and later changed to consistently ignoring constraints in built-ins that cannot handle them. SWI, on the other hand, does entirely undefined things here: sometimes things are copied, sometimes not. As an example, take:
setof(A, ( A in 1..3 ; A in 3..5 ), _) and setof(t, ( A in 1..3 ; A in 3..5 ), _).
By wrapping the goal this can be avoided.
call_unconstrained(Goal_0) :-
    call_residue_vars(Goal_0, Vs),
    (   Vs = [] -> true
    ;   throw(error(representation_error(constraint),_))
    ).
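For example (with library(clpfd) loaded; the exact error text may vary by system):

?- call_unconstrained(X = 1).
X = 1.

?- call_unconstrained(X in 1..3).
ERROR: Cannot represent due to `constraint'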
Beware, however, that SWI has spurious constraints:
?- call_residue_vars(all_different([]), Xs).
Xs = [_A].
Not clear if this is a feature in the meantime. It has been there since the introduction of call_residue_vars/2 about 5 years ago.
I don't think that library(aggregate) covers your use case. aggregate(min) allows for one witness:
min(Expr, Witness)
A term min(Min, Witness), where Min is the minimal version of Expr over all solutions, and Witness is any other template applied to solutions that produced Min. If multiple solutions provide the same minimum, Witness corresponds to the first solution.
Some time ago I wrote a small 'library', lag.pl, with predicates to aggregate with low overhead, hence the name (LAG = Linear AGgregate). I've added a snippet that handles your use case:
integrate(min_list_associated, Goal, Min-Ws) :-
    State = term(_, [], _),
    forall(call(Goal, V, W),          % W stands for witness
           (   arg(1, State, C),      % C is current min
               arg(2, State, CW),     % CW are current min witnesses
               (   ( var(C) ; V #< C )
               ->  U = V, Ws = [W]
               ;   U = C,
                   (   C == V
                   ->  Ws = [W|CW]
                   ;   Ws = CW
                   )
               ),
               nb_setarg(1, State, U),
               nb_setarg(2, State, Ws)
           )),
    arg(1, State, Min), arg(2, State, Ws).
It's a simple-minded extension of integrate(min)...
The comparison method is surely questionable (it uses the less general (==)/2 for equality); it could be worth adopting instead a conventional call like that used by predsort/3. Efficiency-wise, it would be still better to encode the comparison method as an option in the 'function selector' (min_list_associated in this case).
Edit: thanks @false and @Boris for correcting the bug relative to the state representation. Calling nb_setarg(2, State, Ws) actually changes the term's shape when State = (_,[],_) was used. I will update the github repo accordingly...
Using library(pairs) and sort/4, this can be simply written as:
?- bagof(B-A, foo(A, B), Ps),
   sort(1, @=<, Ps, Ss), % or keysort(Ps, Ss)
   group_pairs_by_key(Ss, [Min-As|_]).
Min = 2,
As = [b, e, h].
This call to sort/4 can be replaced with keysort/2, but with sort/4 one can also find, for example, the first arguments associated with the largest second argument: just use @>= as the second argument.
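For example (Ps and Ss bindings elided):

?- bagof(B-A, foo(A, B), Ps),
   sort(1, @>=, Ps, Ss),
   group_pairs_by_key(Ss, [Max-As|_]).
Max = 6,
As = [f].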
This solution is probably not as time and space efficient as the other ones, but may be easier to grok.
But there is another way to do it altogether:
?- bagof(A, ( foo(A, Min), \+ ( foo(_, Y), Y #< Min ) ), As).
Min = 2,
As = [b, e, h].

Trying to count steps through recursion?

This is a cube, the edges of which are directional; it can only go left to right, back to front, and top to bottom.
edge(a,b).
edge(a,c).
edge(a,e).
edge(b,d).
edge(b,f).
edge(c,d).
edge(c,g).
edge(d,h).
edge(e,f).
edge(e,g).
edge(f,h).
edge(g,h).
With the method below we can check if we can go from a to h, for example: move(a,h).
move(X,Y):- edge(X,Y).
move(X,Y):- edge(X,Z), move(Z,Y).
With move2, I'm trying to implement counting of the steps required.
move2(X,Y,N):- N is N+1, edge(X,Y).
move2(X,Y,N):- N is N+1, edge(X,Z), move2(Z,Y,N).
How would I implement this?
Arithmetic evaluation is carried out as usual in Prolog, but assignment doesn't work as usual: a goal like N is N+1 either throws an instantiation error or fails, it never binds N to a new value. So you need to introduce a new variable to hold the incremented value:
move2(X,Y,N,T):- T is N+1, edge(X,Y).
move2(X,Y,N,T):- M is N+1, edge(X,Z), move2(Z,Y,M,T).
and initialize N to 0 at the first call. Such added arguments are often called accumulators.
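For example, with the cube's edge/2 facts loaded (every path from a to h takes three steps, so the same count is reported once per path):

?- move2(a, h, 0, T).
T = 3 ;
T = 3 ;
...
false.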
move2(X,Y,1) :- edge(X,Y), !.
move2(X,Y,NN) :- edge(X,Z), move2(Z,Y,N), NN is N+1.
(is)/2 is very sensitive to instantiations in its second argument. That means that you cannot use it in an entirely relational manner. You can ask X is 1+1., you can even ask 2 is 1+1., but you cannot ask 2 is X+1.
So when you are programming with predicates like (is)/2, you have to imagine what modes a predicate will be used with. Such considerations easily lead to errors, in particular, if you just started. But don't worry, also more proficient programmers still fall prey to such problems.
There is a clean alternative in several Prolog systems: In SICStus, YAP, SWI there is a library(clpfd) which permits you to express relations between integers. Usually this library is used for constraint programming, but you can also use it as a safe and clean replacement for (is)/2 on the integers. Even more so, this library is often very efficiently compiled such that the resulting code is comparable in speed to (is)/2.
?- use_module(library(clpfd)).
true.
?- X #= 1+1.
X = 2.
?- 2 #= 1+1.
true.
?- 2 #= X+1.
X = 1.
So now back to your program, you can simply write:
move2(X,Y,1):- edge(X,Y).
move2(X,Y,N0):- N0 #>= 1, N0 #= N1+1, edge(X,Z), move2(Z,Y,N1).
You now get all distances as required.
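For example, with the cube's edges from the question (one answer per path):

?- move2(a, h, N).
N = 3 ;
N = 3 ;
...
false.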
But there is more to it ...
To make sure that move2/3 actually terminates, try:
?- move2(A, B, N), false.
false.
Now we can be sure that move2/3 always terminates. Always?
Assume you have added a further edge:
edge(f, f).
Now above query loops. But still you can use your program to your advantage!
Determine the number of nodes:
?- setof(C,A^B^(edge(A,B),member(C,[A,B])),Cs), length(Cs, N).
Cs = [a, b, c, d, e, f, g, h], N = 8.
So the longest path will take just 7 steps!
Now you can ask the query again, but this time constraining N to a value less than or equal to 7:
?- 7 #>= N, move2(A,B, N), false.
false.
With this additional constraint, you have again a terminating definition! No more loops.
