Optimising a Prolog Program (remove duplicates, repeat recalculation)

I am very inexperienced with Prolog. I have a data set that contains elements and relations in a graph that has a lot of circularity. There are rules to calculate the summary relation of a path. One of these is: take the path, find the weakest relation on it, and that relation is the one that holds between both ends.
With
Elements E1, E2, E3 and
Relations R1/R1c, R2, R3 (strength low to high) and
structure E1-R3-E2, E1-R1-E2, E2-R2-E3, E3-R1-E2
I can make the following minimal example:
% weaker table
isWeaker( r1, r2).
isWeaker( r2, r3).
weaker( X, Y) :- isWeaker( X, Y).
weaker( X, Y) :-
isWeaker( X, Z),
weaker( Z, Y).
% 'weakest' is <= not <
weakest( X, X, Y) :- =(X,Y).
weakest( X, X, Y) :- weaker( X, Y).
% All direct relations
isADirectRelation( e1, r1, e2).
isADirectRelation( e1, r3, e2).
isADirectRelation( e2, r2, e3).
isADirectRelation( e3, r1, e2).
isADirectRelation( e1, r3, e4).
isADirectRelation( e4, r2, e3).
isADirectRelation( e1, r1, e4).
isADirectRelation( e3, r1, e4).
% derived relations calculations
% Structural Chains
isADerivedRelation( Source, Relation, Target, Visited) :-
\+ member( [Source,Relation,Target], Visited),
weakest( Relation, RelationOne, RelationTwo),
isARelation( Source, RelationOne, Intermediate, [[Source,Relation,Target]|Visited]),
isARelation( Intermediate, RelationTwo, Target, [[Source,Relation,Target]|Visited]).
% major workhorse with anti-circularity
isARelation( Source, Relation, Target, Visited) :-
(isADirectRelation( Source, Relation, Target);
isADerivedRelation( Source, Relation, Target, Visited)).
The result of isARelation( Source, Relation, Target, []). is
e1,r1,e2
e1,r3,e2
e2,r2,e3
e3,r1,e2
e1,r3,e4
e4,r2,e3
e1,r1,e4
e3,r1,e4
e1,r1,e3
e3,r1,e3
e1,r1,e3 duplicate
e3,r1,e3 duplicate
Missing are
e4,r1,e4
e2,r2,e2
Is it at all possible to solve this in Prolog? Formally, yes, of course, but also with a decent performance?

There are many things to be said about this question, so this will be a long and rambling and ultimately unsatisfactory answer. So I might as well start with a pet peeve: Please don't use camelCaseIdentifiers for predicates, we usually use underscore_separated_words instead. I'm not sure why this bugs me in Prolog in particular, but I suspect partly because uppercase letters are syntactically significant.
Moving on, your weakest/3 predicate is broken:
?- weakest(Weakest, r2, r1).
false.
I think you had this right in the first version of your question, but then you removed the third clause of weakest/3 because you thought it caused redundant answers. Needless to say, "efficiency" is useless without correctness. (Also, we usually put the "output" arguments last, not first.)
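For reference, a three-clause weakest/3 along the lines described above might look like the following sketch. It keeps the question's argument order (result first) and its isWeaker/2 facts. The exact wording of the removed third clause is an assumption, since the earlier version of the question is not shown here.

```prolog
% Strength table from the question (low to high).
isWeaker(r1, r2).
isWeaker(r2, r3).

% weaker/2 is the transitive closure of isWeaker/2.
weaker(X, Y) :- isWeaker(X, Y).
weaker(X, Y) :- isWeaker(X, Z), weaker(Z, Y).

% weakest(Weakest, A, B): Weakest is the weaker of A and B (<=, not <).
weakest(X, X, X).                      % both relations are the same
weakest(X, X, Y) :- weaker(X, Y).      % the first relation is weaker
weakest(Y, X, Y) :- weaker(Y, X).      % assumed third clause: the second is weaker
```

With these clauses the query ?- weakest(W, r2, r1). answers W = r1 instead of false.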
Part of the reason you get redundant answers is your use of two (indirectly) recursive calls to isARelation/4 in isADerivedRelation/4. What you are computing is something like the transitive closure of the union of "direct" relations. The usual way to express the transitive closure in Prolog is like this:
transitive_closure(A, B) :-
base_relation(A, B).
transitive_closure(A, C) :-
base_relation(A, B),
transitive_closure(B, C).
That is, we first "take a base step", then recurse. If our base relation has pairs a-b, b-c, c-d, then this will find the solution a-d exactly once, as the composition of the base pair a-b and the derived transitive pair b-d. In contrast, if we were to structure the second clause as you did, with two recursive calls to transitive_closure/2, we would get the solution a-d twice: Once as above, but also once because we would derive the transitive pair a-c and compose it with c-d to give a-d.
You can fix this by changing your first isARelation/4 call in isADerivedRelation/4 into a call to isADirectRelation/3.
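The single-answer behaviour of the "base step, then recurse" pattern can be checked on the a-b, b-c, c-d example with a small sketch like this one. The helper count_proofs/3 is a hypothetical name introduced only for this illustration.

```prolog
% Base relation from the example: a-b, b-c, c-d.
base_relation(a, b).
base_relation(b, c).
base_relation(c, d).

% Transitive closure: take one base step, then recurse.
transitive_closure(A, B) :-
    base_relation(A, B).
transitive_closure(A, C) :-
    base_relation(A, B),
    transitive_closure(B, C).

% count_proofs(+A, +B, -N): N is the number of times the pair A-B is derived.
count_proofs(A, B, N) :-
    findall(x, transitive_closure(A, B), Proofs),
    length(Proofs, N).
```

Here ?- count_proofs(a, d, N). gives N = 1: the pair a-d is derived once, as the base pair a-b composed with the derived pair b-d.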
Another problem is that you are using Visited wrong: You are marking the pair Source-Target as visited before you have proved that such a solution even exists! You should probably mark Source-Intermediate as visited instead.
Even so, you will still get redundant solutions for a pair of elements if there are several different paths between those elements in the graph. This is just how Prolog's logic works: Prolog finds individual answers to your query but does not allow you to talk directly about relationships between those answers. If we want to force it to enumerate everything exactly once, we must leave pure logic.
Some Prolog systems offer a feature called "tabling" which essentially caches all the solutions for a "tabled" predicate and avoids re-computations. This should avoid redundant answers and even simplify your definition: If your closure relation is tabled, you no longer need to track a Visited list because cyclic recomputations will be avoided by the tabling. I can't give you tested code because I have no Prolog with tabling lying around. Even without tabling offered by your system, there is the theoretical possibility of "memoizing" solutions yourself, using Prolog's impure database. It's difficult to get it exactly right with no redundant solutions whatsoever.
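As a rough illustration of the tabling idea, here is a minimal sketch. It assumes a system that supports the `:- table` directive (for example a recent SWI-Prolog or XSB). The predicate names direct/3 and relation/3 are renamings chosen for this sketch, and weakest/3 is rewritten with the output argument last, so none of this is the asker's code as posted.

```prolog
% Assumption: the Prolog system supports tabling via ":- table".
:- table relation/3.

is_weaker(r1, r2).
is_weaker(r2, r3).

weaker(X, Y) :- is_weaker(X, Y).
weaker(X, Y) :- is_weaker(X, Z), weaker(Z, Y).

% weakest(+R1, +R2, -Weakest): covers equal, first weaker, second weaker.
weakest(R, R, R).
weakest(R1, R2, R1) :- weaker(R1, R2).
weakest(R1, R2, R2) :- weaker(R2, R1).

% Direct relations from the question.
direct(e1, r1, e2).
direct(e1, r3, e2).
direct(e2, r2, e3).
direct(e3, r1, e2).
direct(e1, r3, e4).
direct(e4, r2, e3).
direct(e1, r1, e4).
direct(e3, r1, e4).

% relation(S, R, T): some path from S to T has R as its weakest relation.
% No Visited list: tabling stops cyclic recomputation and removes duplicates.
relation(S, R, T) :-
    direct(S, R, T).
relation(S, R, T) :-
    direct(S, R1, M),
    relation(M, R2, T),
    weakest(R1, R2, R).
```

A query such as ?- findall(R-S-T, relation(S, R, T), L), length(L, N). should then report each derived triple exactly once.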
As an alternative to impure Prolog, your problem seems a better fit to datalog or answer-set-programming. These are programming models that use Prolog-like syntax but with set semantics that seems to be exactly what you want: A proposition is either a solution or not, there is no concept of redundant solutions. The entire set of solutions is computed in one go. Cycle elimination is also automatic, so you don't need (in fact, because of the restricted input language, cannot use) a Visited list. If I were you, I would try to do this in Datalog.
As a further Prolog extension, there might be a spiffy solution based on Constraint Handling Rules (CHR). But really, do try Datalog.
Finally, I don't see why you think that e2,r2,e2 is a missing solution. The only path from e2 to e2 that I see goes through e3 and back to e2 via an r1 relation, which is the weakest one, so the solution should be e2,r1,e2.

What I ended up with, thanks to the comments by Lurker and the answer by Isabelle, is this:
% weaker table
isWeaker( r1, r2).
isWeaker( r2, r3).
weaker( X, Y) :- isWeaker( X, Y).
weaker( X, Y) :-
isWeaker( X, Z),
weaker( Z, Y).
% 'weakest' is <= not <
weakest( X, X, Y) :- =(X,Y).
weakest( X, X, Y) :- weaker( X, Y).
% All direct relations
isADirectRelation( e1, r1, e2).
isADirectRelation( e1, r3, e2).
isADirectRelation( e2, r2, e3).
isADirectRelation( e3, r1, e2).
isADirectRelation( e1, r3, e4).
isADirectRelation( e4, r2, e3).
isADirectRelation( e1, r1, e4).
isADirectRelation( e3, r1, e4).
% derived relations calculations
isARelation( Source, Relation, Target, _) :-
isADirectRelation( Source, Relation, Target).
% Structural Chains
isARelation( Source, Relation, Target, Visited) :-
\+ member( [Source,Relation,Target], Visited),
weakest( Relation, RelationOne, RelationTwo),
isADirectRelation( Source, RelationOne, Intermediate),
isARelation( Intermediate, RelationTwo, Target, [[Source,RelationOne,Intermediate]|Visited]).
isARelation( Source, Relation, Target, Visited) :-
\+ member( [Source,Relation,Target], Visited),
weakest( Relation, RelationOne, RelationTwo),
isADirectRelation( Source, RelationTwo, Intermediate),
isARelation( Intermediate, RelationOne, Target, [[Source,RelationTwo,Intermediate]|Visited]).
write_relation( Result) :-
write( Result), nl.
writeAllRelations :-
setof( (Relation, Source, Target), Relation^isARelation( Source, Relation, Target, []), ListOfAllRelations),
% maplist( write_relation, ListOfAllRelations). % For SWIProlog
write( ListOfAllRelations). % for JIProlog
This works and produces the right outcome:
r1,e1,e2
r1,e1,e3
r1,e1,e4
r1,e2,e2
r1,e2,e3
r1,e2,e4
r1,e3,e2
r1,e3,e3
r1,e3,e4
r1,e4,e2
r1,e4,e3
r1,e4,e4
r2,e1,e3
r2,e2,e3
r2,e4,e3
r3,e1,e2
r3,e1,e4
However, in the real world, with 60 or so entities and 800 or so direct relations, I've not found a Prolog that can handle it. I'll look into Datalog.

Related

Program decomposition and lazy_findall

I like the idea of lazy_findall as it helps me with keeping predicates separated and hence program decomposition.
What are the cons of using lazy_findall and are there alternatives?
Below is my "coroutine" version of the branch and bound problem.
It starts with the problem setup:
domain([[a1, a2, a3],
[b1, b2, b3, b4],
[c1, c2]]).
price(a1, 1900).
price(a2, 750).
price(a3, 900).
price(b1, 300).
price(b2, 500).
price(b3, 450).
price(b4, 600).
price(c1, 700).
price(c2, 850).
incompatible(a2, c1).
incompatible(b2, c2).
incompatible(b3, c2).
incompatible(a2, b4).
incompatible(a1, b3).
incompatible(a3, b3).
Derived predicates:
all_compatible(_, []).
all_compatible(X, [Y|_]) :- incompatible(X, Y), !, fail.
all_compatible(X, [_|T]) :- all_compatible(X, T).
list_price(A, Threshold, P) :- list_price(A, Threshold, 0, P).
list_price([], _, P, P).
list_price([H|T], Threshold, P0, P) :-
price(H, P1),
P2 is P0 + P1,
P2 =< Threshold,
list_price(T, Threshold, P2, P).
path([], []).
path([H|T], [I|Q]) :-
member(I, H),
path(T, Q),
all_compatible(I, Q).
The actual logic:
solution([], Paths, Paths, Value, Value).
solution([C|D], Paths0, Paths, Value0, Value) :-
( list_price(C, Value0, V)
-> ( V < Value0
-> solution(D, [C], Paths, V, Value)
; solution(D, [C|Paths0], Paths, Value0, Value)
)
; solution(D, Paths0, Paths, Value0, Value)
).
The glue
solution(Paths, Value) :-
domain(D),
lazy_findall(P, path(D, P), Paths0),
solution(Paths0, [], Paths, 5000, Value).
Here is an alternative no-lazy-findall solution by @gusbro: https://stackoverflow.com/a/68415760/1646086

I am not familiar with lazy_findall but I observe two "drawbacks" with the presented approach:
The code is not as decoupled as one might want, because there is still a mix of "declarative" and "procedural" code in the same predicate. I am putting quotes around the terms because they can mean a lot of things but here I see that path/2 is concerned with both generating paths AND ensuring that they are valid. Similarly solution/5 (or rather list_price/3-4) is concerned with both computing the cost of paths and eliminating too costly ones with respect to some operational bound.
The "bounding" test only happens on complete paths. This means that in practice all paths are generated and verified in order to find the shortest one. It does not matter for such a small problem but might be important for larger instances. Ideally, one might want to detect for instance that the partial path [a1,?,?] will never bring a solution less than 2900 without trying all values for b and c.
My suggestion is to instead use clpfd (or clpz, depending on your system) to solve both issues. With clpfd, one can first state the problem without concern for how to solve it, then call a predefined predicate (like labeling/2) to solve the problem in a (hopefully) clever way.
Here is an example of code that starts from the same "setup" predicates as in the question.
state(Xs,Total):-
domain(Ds),
init_vars(Ds,Xs,Total),
post_comp(Ds,Xs).
init_vars([],[],0).
init_vars([D|Ds],[X|Xs],Total):-
prices(D,P),
length(D,N),
X in 1..N,
element(X, P, C),
Total #= C + Total0,
init_vars(Ds,Xs,Total0).
prices([],[]).
prices([V|Vs],[P|Ps]):-
price(V,P),
prices(Vs,Ps).
post_comp([],[]).
post_comp([D|Ds],[X|Xs]):-
post_comp0(Ds,D,Xs,X),
post_comp(Ds,Xs).
post_comp0([],_,[],_).
post_comp0([D2|Ds],D1,[X2|Xs],X1):-
post_comp1(D1,1,D2,X1,X2),
post_comp0(Ds,D1,Xs,X1).
post_comp1([],_,_,_,_).
post_comp1([V1|Vs1],N,Vs2,X1,X2):-
post_comp2(Vs2,1,V1,N,X2,X1),
N1 is N+1,
post_comp1(Vs1,N1,Vs2,X1,X2).
post_comp2([],_,_,_,_,_).
post_comp2([V2|Vs2],N2,V1,N1,X2,X1):-
post_comp3(V2,N2,X2,V1,N1,X1),
N3 is N2 + 1,
post_comp2(Vs2,N3,V1,N1,X2,X1).
post_comp3(V2,N2,X2,V1,N1,X1) :-
( ( incompatible(V2,V1)
; incompatible(V1,V2)
)
-> X2 #\= N2 #\/ X1 #\= N1
; true
).
Note that the code is relatively straightforward, except for the (quadruple) loop to post the incompatibility constraints. This is due to the way I wanted to reuse the predicates in the question. In practice, one might want to change the way the data is presented.
The problem can then be solved with the following query (in SWI-Prolog):
?- state(Xs, T), labeling([min(T)], Xs).
T = 1900, Xs = [2, 1, 2] ?
In SICStus Prolog, one can write instead:
?- state(Xs, T), minimize(labeling([], Xs), T).
Xs = [2,1,2], T = 1900 ?
Another short predicate could then transform back the [2,1,2] list into [a2,b1,c2] if that format was expected.
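Such a conversion predicate could be sketched as follows. The name indices_values/3 is hypothetical, and nth1/3 is taken from library(lists).

```prolog
:- use_module(library(lists)).

% indices_values(+Domains, +Indices, -Values):
% for each domain list, pick the element at the given 1-based index.
indices_values([], [], []).
indices_values([D|Ds], [I|Is], [V|Vs]) :-
    nth1(I, D, V),
    indices_values(Ds, Is, Vs).
```

For the domain in the question, ?- indices_values([[a1,a2,a3],[b1,b2,b3,b4],[c1,c2]], [2,1,2], Vs). gives Vs = [a2,b1,c2].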

Finding the path length of an Acyclic Graph in Prolog

Okay, so I have the graph:
and I want to be able to create a rule to find all the paths from X to Y and their lengths (number of edges). For example, another path from a to e exists via d, f, and g. Its length is 4.
So far my code looks like this:
edge(a,b).
edge(b,e).
edge(a,c).
edge(c,d).
edge(e,d).
edge(d,f).
edge(d,g).
path(X, Y):-
edge(X, Y).
path(X, Y):-
edge(X, Z),
path(Z, Y).
I am a bit unsure how I should approach this. I've entered a lot of rules that don't work and I am now confused. So, I thought I would bring it back to the basics and see what you guys could come up with. I would also like to know why you did what you did, if that's possible. Thank you in advance.

This kind of question has been asked several times already. Firstly, your edge/2 facts are incomplete: edges such as edge(f,g) or edge(g,e) are missing.
Secondly, you need to store the list of already visited nodes to avoid creating loops.
Then, when visiting a new node, you must check in the Path variable that this node has not been visited yet. However, Path is not yet instantiated at that point, so you need a delayed predicate to check for loops (all_dif/1). Here is a simplified version using the lazy implementation by https://stackoverflow.com/users/4609915/repeat.
go(X, Y) :-
path(X, Y, Path),
length(Path, N),
write(Path), write(' '), write(N), nl.
path(X, Y, [X, Y]):-
edge(X, Y).
path(X, Y, [X | Path]):-
all_dif(Path),
edge(X, Z),
path(Z, Y, Path).
%https://stackoverflow.com/questions/30328433/definition-of-a-path-trail-walk
%which uses a dynamic predicate for defining path
%Here is the lazy implementation of loop-checking
all_dif(Xs) :- % enforce pairwise term inequality
freeze(Xs, all_dif_aux(Xs,[])). % (may be delayed)
all_dif_aux([], _).
all_dif_aux([E|Es], Vs) :-
maplist(dif(E), Vs), % is never delayed
freeze(Es, all_dif_aux(Es,[E|Vs])). % (may be delayed)
Here are some executions with comments
?- go(a,e).
[a,b,e] 3 %%% three nodes: length=2
true ;
[a,c,d,f,g,e] 6
true ;
[a,c,f,g,e] 5
true ;
[a,d,f,g,e] 5
true ;
false. %%% no more solutions
Does this answer your question?
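The question asked for the length in edges, whereas go/2 above prints the number of nodes. A minimal sketch that counts edges directly is shown here. It uses only the edge/2 facts listed in the question, assumes the graph is acyclic (so no visited list is kept), and path_len/4 is a hypothetical name.

```prolog
% Edges exactly as listed in the question.
edge(a, b).
edge(b, e).
edge(a, c).
edge(c, d).
edge(e, d).
edge(d, f).
edge(d, g).

% path_len(?X, ?Y, -Path, -N): Path is a list of nodes from X to Y
% and N is its length in edges. Assumes an acyclic graph.
path_len(X, Y, [X, Y], 1) :-
    edge(X, Y).
path_len(X, Y, [X|Path], N) :-
    edge(X, Z),
    path_len(Z, Y, Path, N0),
    N is N0 + 1.
```

With these facts, ?- path_len(a, d, P, N). answers P = [a,b,e,d], N = 3 and then P = [a,c,d], N = 2.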

Prolog - Path finding and length given Relation

I just began learning Prolog and I wanted to understand pathfinding better. I have a few examples of relationships; however, I don't know how to find the path and length of a relationship when the relationship is cyclical. I've been trying to create a list that documents visited nodes, but I keep receiving errors.
Below are a few examples as well as my attempt to find a path (given the relationship, source, target, path list, and length):
is_a(parallelogram, quadrilateral).
is_a(trapezoid, quadrilateral).
is_a(rhombus, parallelogram).
is_a(rectangle, parallelogram).
is_a(square, rhombus).
is_a(square, rectangle).
edge(a, b).
edge(b, c).
edge(c, d).
edge(d, a).
friend(alice, bob).
friend(bob, carol).
friend(carol, daniel).
friend(carol, eve).
friends(A,B) :-
friend(A,B);
friend(B,A).
transit(Rel, S, T) :-
call(Rel, S, X),
(X = T; transit(Rel, X, T)).
path_(Rel,Source,Target,Path,Len) :-
path_(Rel,Source,Target,Path,Len,[]).
path_(Rel,Source,Target,Path,Len,Visited) :-
transit(Rel,Source,Target),
transit(Rel,Source,Mid),
Mid == Target, !,
append(Visited,[Source,Target],Path),
length(Path,L),
Len is L+1.
path_(Rel,Source,Target,Path,Len,Visited) :-
transit(Rel,Source,Target),
transit(Rel,Source,Mid),
not(member(Mid,Visited)),
path_(Rel,Mid,Target,Path,Len,[Source|Visited]).
The above is my second attempt, but I receive errors on everything. My first attempt only worked with non-cyclical paths, such as for the is_a relationships, which is noted below:
path0(Rel,From,To,[From,To],2) :-
transit(Rel,From,To),
call(Rel, From, To).
path0(Rel,From,To,[From|Xs],Len) :-
transit(Rel,From,X),
call(Rel,From,X),
path0(Rel,X,To,Xs,_),
length(Xs, L),
Len is L+1.
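One possible way to handle the cyclic case is to thread a list of visited nodes through the search and refuse to step onto a node twice. The sketch below is not taken from the question: simple_path/5 and walk/5 are hypothetical names, and Len counts nodes (as path0/5 does), not edges.

```prolog
:- use_module(library(lists)).

% Example relations copied from the question.
edge(a, b).
edge(b, c).
edge(c, d).
edge(d, a).

friend(alice, bob).
friend(bob, carol).
friend(carol, daniel).
friend(carol, eve).

friends(A, B) :- friend(A, B) ; friend(B, A).

% simple_path(+Rel, +Source, ?Target, -Path, -Len):
% Path is a cycle-free list of nodes from Source to Target, Len its node count.
simple_path(Rel, Source, Target, Path, Len) :-
    walk(Rel, Source, Target, [Source], RevPath),
    reverse(RevPath, Path),
    length(Path, Len).

% walk/5 accumulates visited nodes in reverse order.
walk(Rel, Node, Target, Visited, [Target|Visited]) :-
    call(Rel, Node, Target),
    \+ memberchk(Target, Visited).
walk(Rel, Node, Target, Visited, Path) :-
    call(Rel, Node, Next),
    \+ memberchk(Next, Visited),          % never revisit a node
    walk(Rel, Next, Target, [Next|Visited], Path).
```

For example, ?- simple_path(edge, a, d, P, L). gives P = [a,b,c,d], L = 4 and then terminates, even though edge/2 contains the cycle a-b-c-d-a.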

Memorising (and caching) solutions found in a Prolog query?

In this question on StackExchange I've asked (and it has been solved) about a Prolog program I have been trying to create. But while it works in principle, it doesn't scale to my real world need.
Before I start learning yet another language (Datalog), I'd like to build on the work I have already done and find out how I can make Prolog remember results from earlier queries, so that the same query is only executed once. So I'm looking for a way to add the result of a successful query to a list; if that same query is asked again, it should not redo the calculation but use the remembered result.
My main problem is that I cannot find a way to keep the result of a successful query in a list that is passed 'up the chain'.
In
% get this out of the way, quickly
isARelation( Source, Relation, Target, _) :-
isADirectRelation( Source, Relation, Target).
% Structural Chains
isARelation( Source, Relation, Target, Visited) :-
\+ member( [Source,Relation,Target], Visited),
structuralOrDependencyRelation( RelationOne),
structuralOrDependencyRelation( RelationTwo),
weakest( Relation, RelationOne, RelationTwo),
isADirectRelation( Source, RelationOne, Intermediate),
isARelation( Intermediate, RelationTwo, Target, [[Source,RelationOne,Intermediate]|Visited]).
isARelation( Source, Relation, Target, Visited) :-
\+ member( [Source,Relation,Target], Visited),
structuralOrDependencyRelation( RelationOne),
structuralOrDependencyRelation( RelationTwo),
weakest( Relation, RelationOne, RelationTwo),
isADirectRelation( Source, RelationTwo, Intermediate),
isARelation( Intermediate, RelationOne, Target, [[Source,RelationTwo,Intermediate]|Visited]).
How do I implement that the first call
isARelation(A, B, C, []).
does the calculation of the results, and a second call
isARelation(A, B, C, []).
uses the earlier found result, which is kept 'globally'?

This is not really an answer to your question :(
The other answer has the right idea, but the implementation has a problem. Let's say we want to make a memoized version of squaring a number:
:- dynamic mem_square/2.
square(N, S) :-
( mem_square(N, S)
; S is N*N,
assertz(mem_square(N, S))
).
BTW, the parentheses in the other answer are completely unnecessary. These are also unnecessary, but this is how you usually wrap a disjunction, just in case it is part of a conjunction. In other words, this: a ; (b, c) is the same as a ; b, c, but (a ; b), c is not the same.
Now, if I load this program from the toplevel and query:
?- square(3, S).
S = 9. % first time it's fine
?- square(3, S).
S = 9 ;
S = 9. % now there's two
?- square(3, S).
S = 9 ;
S = 9 ;
S = 9. % now three
If you keep on querying a memoized fact, and you backtrack into it, you will keep on computing again and again and adding more and more identical copies of it. Instead, you can try for example this:
:- dynamic mem_square/2.
square(N, S) :-
( mem_square(N, S)
-> true
; S is N*N,
assertz(mem_square(N, S))
).
Now, there is no choice point.
This is still not a clean implementation if you are meant to have multiple solutions. Any solutions after the first will be cut by the ->.
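One way to keep multiple solutions is to compute and store all of them the first time a given input is seen, and to record separately that the input has been handled. The following is a sketch only. All predicate names are hypothetical, between/3 and forall/2 are assumed to be available as built-ins (as in SWI-Prolog), and it assumes the first argument is ground at call time and that compute_foo/2 has finitely many solutions.

```prolog
:- dynamic cached_foo/2, foo_done/1.

% Stand-in for an expensive, nondeterministic computation (hypothetical).
compute_foo(N, R) :-
    between(1, N, K),
    R is K mod 2.

% foo(+X, ?Y): memoized front end for compute_foo/2.
% On the first call for X, all distinct solutions are computed and stored;
% later calls only read the cache. Assumes X is ground when called.
foo(X, Y) :-
    (   foo_done(X)
    ->  true
    ;   findall(Y0, compute_foo(X, Y0), Ys0),
        sort(Ys0, Ys),                         % drop duplicate solutions
        forall(member(Y1, Ys), assertz(cached_foo(X, Y1))),
        assertz(foo_done(X))
    ),
    cached_foo(X, Y).
```

Repeating ?- foo(4, Y). then enumerates Y = 0 and Y = 1 every time, without the cache growing.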

This is advice on how to generically do what tabling does. I haven't followed this advice in ages myself so there may be inaccuracies here. Hopefully the rest of the gang will show up and correct me if I'm off-base.
You have a predicate foo/4 that is inefficient.
Add this to your file:
:- dynamic(cached_foo/4).
Rename foo/4 to compute_foo/4 or something.
Make a new predicate foo/4 that looks like this:
foo(X, Y, Z, Q) :-
cached_foo(X, Y, Z, Q) ;
(
compute_foo(X, Y, Z, Q),
assertz(cached_foo(X, Y, Z, Q))
).

Making a STRIPS Planner using BFS in Prolog

I have the following generic Breadth-first search code for Prolog and I would like to take the simple node representation s(a,b,go(a,b)) and change it to a predicate so that go(a,b) will represent a STRIPS operator say stack(A,B) so that I might have two predicates: s(S0,S,stack(A,B)) and s(S0,S,unstack(B,A)) (classic blocks world problem) which can be used by the breadth-first search below. I'm not sure if this is possible or how I would go about doing it. My first idea was to have a predicate as follows:
% S0 is the given state and S is its successor under the operator stack(A,B),
% provided A and B are not the same block. The next state S should contain
% the same state/precondition information as S0, except that 'on(A,B)'
% is added and 'clear(B)' is removed.
s(S0,S,stack(A,B)) :-
A \== B,
% other predicates etc
The breadth-first search is given below.
:- use_module(library(lists)).
% bfs(?initial_state, ?goal_state, ?solution)
% does not include cycle detection
bfs(X,Y,P) :-
bfs_a(Y,[n(X,[])],R),
reverse(R,P).
bfs_a(Y,[n(Y,P)|_],P).
bfs_a(Y,[n(S,P1)|Ns],P) :-
findall(n(S1,[A|P1]),s(S,S1,A),Es),
append(Ns,Es,O),
bfs_a(Y,O,P).
% s(?state, ?next_state, ?operator).
s(a,b,go(a,b)).
s(a,c,go(a,c)).

bfs(S0,Goal,Plan) :-
bfs_a(Goal1,[n(S0,[])],R),
subset(Goal,Goal1),
reverse(R,Plan).
bfs_a(Y,[n(Y,P)|_],P).
bfs_a(Y,[n(S,P1)|Ns],P) :-
findall(n(S1,[A|P1]), s(S,S1,A), Es),
append(Ns,Es,O),
bfs_a(Y,O,P).
s(State,NextState,Operation) :-
opn(Operation, PreList), subset(PreList, State),
deletes(Operation, DelList), subtract(State, DelList, TmpState),
adds(Operation, AddList), union(AddList, TmpState, NextState).
subset([ ],_).
subset([H|T],List) :-
member(H,List),
subset(T,List).
opn(move(Block,X1,X2),[clear(Block),clear(X2),on(Block,X1)]) :-
block(Block),object(X1),object(X2),
Block \== X1, X2 \== Block, X1 \== X2.
adds(move(Block,X1,X2),[on(Block,X2),clear(X1)]).
deletes(move(Block,X1,X2),[on(Block,X1),clear(X2)]).
object(X) :- place(X) ; block(X).
block(a).
block(b).
block(c).
block(d).
place(x1).
place(x2).
place(x3).
