Given the following facts in a database:
foo(a, 3).
foo(b, 2).
foo(c, 4).
foo(d, 3).
foo(e, 2).
foo(f, 6).
foo(g, 3).
foo(h, 2).
I want to collect all first arguments that have the smallest second argument, plus the value of the second argument. First try:
find_min_1(Min, As) :-
setof(B-A, foo(A, B), [Min-_|_]),
findall(A, foo(A, Min), As).
?- find_min_1(Min, As).
Min = 2,
As = [b, e, h].
Instead of setof/3, I could use aggregate/3:
find_min_2(Min, As) :-
aggregate(min(B), A^foo(A, B), Min),
findall(A, foo(A, Min), As).
?- find_min_2(Min, As).
Min = 2,
As = [b, e, h].
NB
This only gives the same results if I am looking for the minimum of a number. If an arithmetic expression in involved, the results might be different. If a non-number is involved, aggregate(min(...), ...) will throw an error!
Or, instead, I can use the full key-sorted list:
find_min_3(Min, As) :-
setof(B-A, foo(A, B), [Min-First|Rest]),
min_prefix([Min-First|Rest], Min, As).
min_prefix([Min-First|Rest], Min, [First|As]) :-
!,
min_prefix(Rest, Min, As).
min_prefix(_, _, []).
?- find_min_3(Min, As).
Min = 2,
As = [b, e, h].
Finally, to the question(s):
Can I do this directly with library(aggregate)? It feels like it should be possible....
Or is there a predicate like std::partition_point from the C++ standard library?
Or is there some easier way to do this?
EDIT:
To be more descriptive. Say there was a (library) predicate partition_point/4:
partition_point(Pred_1, List, Before, After) :-
partition_point_1(List, Pred_1, Before, After).
partition_point_1([], _, [], []).
partition_point_1([H|T], Pred_1, Before, After) :-
( call(Pred_1, H)
-> Before = [H|B],
partition_point_1(T, Pred_1, B, After)
; Before = [],
After = [H|T]
).
(I don't like the name but we can live with it for now)
Then:
find_min_4(Min, As) :-
setof(B-A, foo(A, B), [Min-X|Rest]),
partition_point(is_min(Min), [Min-X|Rest], Min_pairs, _),
pairs_values(Min_pairs, As).
is_min(Min, Min-_).
?- find_min_4(Min, As).
Min = 2,
As = [b, e, h].
What is the idiomatic approach to this class of problems?
Is there a way to simplify the problem?
Many of the following remarks could be added to many programs here on SO.
Imperative names
Every time, you write an imperative name for something that is a relation you will reduce your understanding of relations. Not much, just a little bit. Many common Prolog idioms like append/3 do not set a good example. Think of append(As,As,AsAs). The first argument of find_min(Min, As) is the minimum. So minimum_with_nodes/2 might be a better name.
findall/3
Do not use findall/3 unless the uses are rigorously checked, essentially everything must be ground. In your case it happens to work. But once you generalize foo/2 a bit, you will lose. And that is frequently a problem: You write a tiny program ; and it seems to work.
Once you move to bigger ones, the same approach no longer works. findall/3 is (compared to setof/3) like a bull in a china shop smashing the fine fabric of shared variables and quantification. Another problem is that accidental failure does not lead to failure of findall/3 which often leads to bizarre, hard to imagine corner cases.
Untestable, too specific program
Another problem is somewhat related to findall/3, too. Your program is so specific, that it is quite improbable that you will ever test it. And marginal changes will invalidate your tests. So you will soon give up to perform testing. Let's see what is specific: Primarily the foo/2 relation. Yes, only an example. Think of how to set up a test configuration where foo/2 may change. After each change (writing a new file) you will have to reload the program. This is so complex, chances are you will never do it. I presume you do not have a test harness for that. Plunit for one, does not cover such testing.
As a rule of thumb: If you cannot test a predicate on the top level you never will. Consider instead
minimum_with(Rel_2, Min, Els)
With such a relation, you can now have a generalized xfoo/3 with an additional parameter, say:
xfoo(o, A,B) :-
foo(A,B).
xfoo(n, A,B) :-
newfoo(A,B).
and you most naturally get two answers for minimum_with(xfoo(X), Min, Els). Would you have used findall/3 instead of setof/3 you already would have serious problems. Or just in general: minmum_with(\A^B^member(A-B, [x-10,y-20]), Min, Els). So you can play around on the top level and produce lots of interesting test cases.
Unchecked border cases
Your version 3 is clearly my preferred approach, however there are still some parts that can be improved. In particular, if there are answers that contain variables as a minimum. These should be checked.
And certainly, also setof/3 has its limits. And ideally you would test them. Answers should not contain constraints, in particular not in the relevant variables. This shows how setof/3 itself has certain limits. After the pioneering phase, SICStus produced many errors for constraints in such cases (mid 1990s), later changed to consequently ignoring constraints in built-ins that cannot handle them. SWI on the other hand does entirely undefined things here. Sometimes things are copied, sometimes not. As an example take:
setof(A, ( A in 1..3 ; A in 3..5 ), _) and setof(t, ( A in 1..3 ; A in 3.. 5 ), _).
By wrapping the goal this can be avoided.
call_unconstrained(Goal_0) :-
call_residue_vars(Goal_0, Vs),
( Vs = [] -> true ; throw(error(representation_error(constraint),_)) ).
Beware, however, that SWI has spurious constraints:
?- call_residue_vars(all_different([]), Xs).
Xs = [_A].
Not clear if this is a feature in the meantime. It has been there since the introduction of call_residue_vars/2 about 5 years ago.
I don't think that library(aggregate) covers your use case. aggregate(min) allows for one witness:
min(Expr, Witness)
A term min(Min, Witness), where Min is the minimal version of Expr over all solutions, and Witness is any other template applied to solutions that produced Min. If multiple solutions provide the same minimum, Witness corresponds to the first solution.
Some time ago, I wrote a small 'library', lag.pl, with predicates to aggregate with low overhead - hence the name (LAG = Linear AGgregate). I've added a snippet, that handles your use case:
integrate(min_list_associated, Goal, Min-Ws) :-
State = term(_, [], _),
forall(call(Goal, V, W), % W stands for witness
( arg(1, State, C), % C is current min
arg(2, State, CW), % CW are current min witnesses
( ( var(C) ; V #< C )
-> U = V, Ws = [W]
; U = C,
( C == V
-> Ws = [W|CW]
; Ws = CW
)
),
nb_setarg(1, State, U),
nb_setarg(2, State, Ws)
)),
arg(1, State, Min), arg(2, State, Ws).
It's a simple minded extension of integrate(min)...
The comparison method it's surely questionable (it uses less general operator for equality), could be worth to adopt instead a conventional call like that adopted for predsort/3. Efficiency wise, still better would be to encode the comparison method as option in the 'function selector' (min_list_associated in this case)
edit thanks #false and #Boris for correcting the bug relative to the state representation. Calling nb_setarg(2, State, Ws) actually changes the term' shape, when State = (_,[],_) was used. Will update the github repo accordingly...
Using library(pairs) and [sort/4], this can be simply written as:
?- bagof(B-A, foo(A, B), Ps),
sort(1, #=<, Ps, Ss), % or keysort(Ps, Ss)
group_pairs_by_key(Ss, [Min-As|_]).
Min = 2,
As = [b, e, h].
This call to sort/4 can be replaced with keysort/2, but with sort/4 one can also find for example the first arguments associated with the largest second argument: just use #>= as the second argument.
This solution is probably not as time and space efficient as the other ones, but may be easier to grok.
But there is another way to do it altogether:
?- bagof(A, ( foo(A, Min), \+ ( foo(_, Y), Y #< Min ) ), As).
Min = 2,
As = [b, e, h].
Related
I am developing a path finding algorithm in Prolog, giving all nodes accessible by a path from a starting node. To avoid duplicate paths, visited nodes are kept in a list.
Nodes and neighbors are defined as below:
node(a).
node(b).
node(c).
node(d).
node(e).
edge(a,b).
edge(b,c).
edge(c,d).
edge(b,d).
neighbor(X,Y) :- edge(X,Y).
neighbor(X,Y) :- edge(Y,X).
The original algorithm below works fine:
path2(X,Y) :-
pathHelper(X,Y,[X]).
pathHelper(X,Y,L) :-
neighbor(X,Y),
\+ member(Y,L).
pathHelper(X,Y,H) :-
neighbor(X,Z),
\+ member(Z,H),
pathHelper(Z,Y,[Z|H]).
This works fine
[debug] ?- path2(a,X).
X = b ;
X = c ;
X = d ;
X = d ;
X = c ;
false.
however, when changing the order of the two clauses in the second definition, such as below
pathHelper(X,Y,L) :-
\+ member(Y,L),
neighbor(X,Y).
When trying the same here, swipl returns the following:
[debug] ?- path2(a,X).
false.
The query doesn't work anymore, and only returns false. I have tried to understand this through the tracecommand, but still can't make sense of what exactly is wrong.
In other words, I am failing to understand why the order of neighbor(X,Y)and \+ member(Y,L)is crucial here. It makes sense to have neighbor(X,Y) first in terms of efficiency, but not in terms of correctness to me.
You are now encountering the not so clean-cut borders of pure Prolog and its illogical surroundings. Welcome to the real world.
Or rather, not welcome! Instead, let's try to improve your definition. The key problem is
\+ member(Y, [a]), Y = b.
which fails while
Y = b, \+ member(Y,[a]).
succeeds. There is no logic to justify this. It's just the operational mechanism of Prolog's built-in (\+)/1.
Happily, we can improve upon this. Enter non_member/2.
non_member(_X, []).
non_member(X, [E|Es]) :-
dif(X, E),
non_member(X, Es).
Now,
?- non_member(Y, [a]).
dif(Y,a).
Mark this answer, it says: Yes, Y is not an element of [a], provided Y is different from a. Think of the many solutions this answer includes, like Y = 42, or Y = b and infinitely many more such solutions that are not a. Infinitely many solutions captured in nine characters!
Now, both non_member(Y, [a]), Y = b and Y = b, non_member(Y, [a]) succeed. So exchanging them has only influence on runtime and space consumption. If we are at it, note that you check for non-memberness in two clauses. You can factor this out. For a generic solution to this, see closure/3. With it, you simply say: closure(neighbor, A,B).
Also consider the case where you have only edge(a,a). Your definition fails here for path2(a,X). But shouldn't this rather succeed?
And the name path2/2 is not that fitting, rather reserve this word for an actual path.
The doubt you have is related to how prolog handle negation. Prolog uses negation as failure. This means that, if prolog has to negate a goal g (indicate it with not(g)), it tries to prove g by executing it and then, if the g fails, not(g) (or \+ g, i.e. the negation of g) succeeds and viceversa.
Keep in mind also that, after the execution of not(g), if the goal has variables, they will not be instantiated. This because prolog should instantiate the variables with all the terms that makes g fail, and this is likely an infinite set (for example for a list, not(member(A,[a]) should instantiate the variable A with all the elements that are not in the list).
Let's see an example. Consider this simple program:
test:-
L = [a,b,c],
\+member(A,L),
writeln(A).
and run it with ?- trace, test. First of all you get a Singleton variable in \+: A warning for the reason i explained before, but let's ignore it and see what happens.
Call:test
Call:_5204=[a, b]
Exit:[a, b]=[a, b]
Call:lists:member(_5204, [a, b])
Exit:lists:member(a, [a, b]) % <-----
Fail:test
false
You see at the highlighted line that the variable A is instantiated to a and so member/2 succeeds and so \+ member(A,L) is false.
So, in your code, if you write pathHelper(X,Y,L) :- \+ member(Y,L), neighbor(X,Y)., this clause will always fail because Y is not sufficiently instantiated. If you swap the two terms, Y will be ground and so member/2 can fail (and \+member succeeds).
My ultimate goal is to make a reified version of automaton/3, that freezes if there are any variables in the sequence passed to it. i.e. I dont want the automaton to instantiate variables.
(fd_length/3, if_/3 etc as defined by other people here on so).
To start with I have a reified test for single variables:
var_t(X,T):-
var(X) ->
T=true;
T=false.
This allows me to implement:
if_var_freeze(X,Goal):-
if_(var_t(X),freeze(X,Goal),Goal).
So I can do something like:
?-X=bob,Goal =format("hello ~w\n",[X]),if_var_freeze(X,Goal).
Which will behave the same as:
?-Goal =format("hello ~w\n",[X]),if_var_freeze(X,Goal),X=bob.
How do I expand this to work on a list of variables so that Goal is only called once, when all the vars have been instantiated?
In this method if I have more than one variable I can get this behaviour which I don't want:
?-List=[X,Y],Goal = format("hello, ~w and ~w\n",List),
if_var_freeze(X,Goal),
if_var_freeze(Y,Goal),X=bob.
hello, bob and _G3322
List = [bob, Y],
X = bob,
Goal = format("hello, ~w and ~w\n", [bob, Y]),
freeze(Y, format("hello, ~w and ~w\n", [bob, Y])).
I have tried:
freeze_list(List,Goal):-
freeze_list_h(List,Goal,FrozenList),
call(FrozenList).
freeze_list_h([X],Goal,freeze(X,Goal)).
freeze_list_h(List,Goal,freeze(H,Frozen)):-
List=[H|T],
freeze_list_h(T,Goal,Frozen).
Which works like:
?- X=bob,freeze_list([X,Y,Z],format("Hello ~w, ~w and ~w\n",[X,Y,Z])),Y=fred.
X = bob,
Y = fred,
freeze(Z, format("Hello ~w, ~w and ~w\n", [bob, fred, Z])) .
?- X=bob,freeze_list([X,Y,Z],format("Hello ~w, ~w and ~w\n",[X,Y,Z])),Y=fred,Z=sue.
Hello bob, fred and sue
X = bob,
Y = fred,
Z = sue .
Which seems okay, but I am having trouble applying it to automaton/3.
To reiterate the aim is to make a reified version of automaton/3, that freezes if there are any variables in the sequence passed to it. i.e. I don't want the automaton to instantiate variables.
This is what I have:
ga(Seq,G) :-
G=automaton(Seq, [source(a),sink(c)],
[arc(a,0,a), arc(a,1,b),
arc(b,0,a), arc(b,1,c),
arc(c,0,c), arc(c,1,c)]).
max_seq_automaton_t(Max,Seq,A,T):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each member of seq
maplist(=(false),Var_T_List), %check that all are false i.e no uninstaninated vars
call(A),!,
T=true.
max_seq_automaton_t(Max,Seq,A,T):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each member of seq
maplist(=(false),Var_T_List), %check that all are false i.e no uninstaninated vars
\+call(A),!,
T=false.
max_seq_automaton_t(Max,Seq,A,true):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each
memberd_t(true,Var_T_List,true), %at least one var
freeze_list_h(Seq,A,FrozenList),
call(FrozenList),
call(A).
max_seq_automaton_t(Max,Seq,A,false):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each
memberd_t(true,Var_T_List,true), %at least one var
freeze_list_h(Seq,A,FrozenList),
call(FrozenList),
\+call(A).
Which does not work, The following goal should be frozen until X is instantiated:
?- Seq=[X,1],ga(Seq,A),max_seq_automaton_t(3,Seq,A,T).
Seq = [1, 1],
X = 1,
A = automaton([1, 1], [source(a), sink(c)], [arc(a, 0, a), arc(a, 1, b), arc(b, 0, a), arc(b, 1, c), arc(c, 0, c), arc(c, 1, c)]),
T = true
Update This is what I now have which I think works as I originally intended but I am digesting what #Mat has said to think if this is actually what I want. Will update further tomorrow.
goals_to_conj([G|Gs],Conj) :-
goals_to_conj_(Gs,G,Conj).
goals_to_conj_([],G,nonvar(G)).
goals_to_conj_([G|Gs],G0,(nonvar(G0),Conj)) :-
goals_to_conj_(Gs,G,Conj).
max_seq_automaton_t(Max,Seq,A,T):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each member of seq
maplist(=(false),Var_T_List), %check that all are false i.e no uninstaninated vars
call(A),!,
T=true.
max_seq_automaton_t(Max,Seq,A,T):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each member of seq
maplist(=(false),Var_T_List), %check that all are false i.e no uninstaninated vars
\+call(A),!,
T=false.
max_seq_automaton_t(Max,Seq,A,T):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each
memberd_t(true,Var_T_List,true), %at least one var
goals_to_conj(Seq,GoalForWhen),
when(GoalForWhen,(A,T=true)).
max_seq_automaton_t(Max,Seq,A,T):-
Max #>=L,
fd_length(Seq,L),
maplist(var_t,Seq,Var_T_List), %find var_t for each
memberd_t(true,Var_T_List,true), %at least one var
goals_to_conj(Seq,GoalForWhen),
when(GoalForWhen,(\+A,T=false)).
In my view, you are making great progress with Prolog. At this point it makes sense to proceed a bit more prudently though. All the things you are asking for can, in principle, be solved easily. You only need a generalization of freeze/2, which is available as when/2.
However, let us take a step back and more deeply consider what is actually going on here.
Declaratively, when we state a constraint, we mean that it holds. We do not mean "It holds only when everything is instantiated", because that would reduce the constraint to a mere checker, leading to a "generate-and-test" approach. The point of constraints is exactly to prune whenever possible, leading to a much reduced search space in many cases.
Exactly the same holds for reified constraints. When we post a reified constraint, we state that the reification holds. Not only in cases where everything is instantiated, but always. The point is exactly that the (reified) constraint can be used in all directions. If the constraint that is being reified is already entailed, we get to know it. Likewise, if it cannot hold, we get to know it. If either possibility may be the case, we need to search explicitly for solutions, or determine that none exist. If we want to insist that the constraint that is being reified holds, it is easily possible; etc.
However, the point in all cases is exactly that we can focus on the declarative semantics of the constraint, very free from extra-logical, procedural considerations like what is being instantiated and when. If I answered your literal question, it would move you closer to operational considerations, much closer than you probably need or want in actuality.
Therefore, I am not going to answer your literal question. But I will give you a solution to your actual, underlying issue.
The point is to reifiy automaton/3. A constraint reification will not by itself prune anything as long as it is open whether the constraint that is being reified actually holds or not. Only when we insist that the constraint that is being reified holds does propagation occur.
It is easy to reify automaton/3, by reifying the conjunction of constraints that constitute its decomposition. Here is one way to do it, based on code that is freely available in SWI-Prolog:
:- use_module(library(clpfd)).
automaton(Vs, Ns, As, T) :-
must_be(list(list), [Vs,Ns,As]),
include_args1(source, Ns, Sources),
include_args1(sink, Ns, Sinks),
phrase((arcs_relation(As, Relation),
nodes_nums(Sinks, SinkNums0),
nodes_nums(Sources, SourceNums0)), [[]-0], _),
phrase(transitions(Vs, Start, End), Tuples),
list_to_drep(SinkNums0, SinkDrep),
list_to_drep(SourceNums0, SourceDrep),
( Start in SourceDrep #/\
End in SinkDrep #/\
tuples_in(Tuples, Relation)) #<==> T.
include_args1(Goal, Ls0, As) :-
include(Goal, Ls0, Ls),
maplist(arg(1), Ls, As).
list_to_drep([L|Ls], Drep) :-
foldl(drep_, Ls, L, Drep).
drep_(L, D0, D0\/L).
transitions([], S, S) --> [].
transitions([Sig|Sigs], S0, S) --> [[S0,Sig,S1]],
transitions(Sigs, S1, S).
nodes_nums([], []) --> [].
nodes_nums([Node|Nodes], [Num|Nums]) -->
node_num(Node, Num),
nodes_nums(Nodes, Nums).
arcs_relation([], []) --> [].
arcs_relation([arc(S0,L,S1)|As], [[From,L,To]|Rs]) -->
node_num(S0, From),
node_num(S1, To),
arcs_relation(As, Rs).
node_num(Node, Num), [Nodes-C] --> [Nodes0-C0],
{ ( member(N-I, Nodes0), N == Node ->
Num = I, C = C0, Nodes = Nodes0
; Num = C0, C is C0 + 1, Nodes = [Node-C0|Nodes0]
) }.
sink(sink(_)).
source(source(_)).
Note that this propagates nothing whatsoever as long as T is unknown.
I now use the following definition for a few sample queries:
seq(Seq, T) :-
automaton(Seq, [source(a),sink(c)],
[arc(a,0,a), arc(a,1,b),
arc(b,0,a), arc(b,1,c),
arc(c,0,c), arc(c,1,c)], T).
Examples:
?- seq([X,1], T).
Result (omitted): Constraints are posted, nothing is propagated.
Next example:
?- seq([X,1], T), X = 3.
X = 3,
T = 0.
Clearly, the reified automaton/3 constraint does not hold in this case. However, the reifying constraint of course still holds, as always, and this is the reason why T=0 in this case.
Next example:
?- seq([1,1], T), indomain(T).
T = 0 ;
T = 1.
Oh-oh! What is going on here? How can it be that the constraint is both true and false? This is because we do not see all constraints that are actually posted in this example. Use call_residue_vars/2 to see the whole truth.
In fact, try it on the simpler example:
?- call_residue_vars(seq([1,1],0), Vs).
The pending residual constraints that still need to be satisfied in this case are:
_G1496 in 0..1,
_G1502#/\_G1496#<==>_G1511,
tuples_in([[_G1505,1,_G1514]], [[0,0,0],[0,1,1],[1,0,0],[1,1,2],[2,0,2], [2,1,2]])#<==>_G825,
tuples_in([[_G831,1,_G827]], [[0,0,0],[0,1,1],[1,0,0],[1,1,2],[2,0,2],[2,1,2]])#<==>_G826,
_G829 in 0#<==>_G830,
_G830 in 0..1,
_G830#/\_G828#<==>_G831,
_G828 in 0..1,
_G827 in 2#<==>_G828,
_G829 in 0..1,
_G829#/\_G826#<==>0,
_G826 in 0..1,
_G825 in 0..1
So, the above only holds if these constraints, which are said to still flounder, also hold.
Here is an auxiliary definition that helps you label remaining finite domain variables. It suffices for this example:
finite(V) :-
fd_dom(V, L..U),
dif(L, inf),
dif(U, sup).
We can now paste back the residual program (which consists of CLP(FD) constraints), and use label_fixpoint/1 to label variables whose domain is finite:
?- Vs0 = [_G1496, _G1499, _G1502, _G1505, _G1508, _G1511, _G1514, _G1517, _G1520, _G1523, _G1526],
_G1496 in 0..1,
_G1502#/\_G1496#<==>_G1511,
tuples_in([[_G1505,1,_G1514]], [[0,0,0],[0,1,1],[1,0,0],[1,1,2],[2,0,2], [2,1,2]])#<==>_G825,
tuples_in([[_G831,1,_G827]], [[0,0,0],[0,1,1],[1,0,0],[1,1,2],[2,0,2],[2,1,2]])#<==>_G826,
_G829 in 0#<==>_G830, _G830 in 0..1,
_G830#/\_G828#<==>_G831, _G828 in 0..1,
_G827 in 2#<==>_G828, _G829 in 0..1,
_G829#/\_G826#<==>0, _G826 in 0..1, _G825 in 0..1,
include(finite, Vs0, Vs),
label(Vs).
Note that we cannot directly use labeling in the original program, i.e., we cannot do:
?- call_residue_vars(seq([1,1],0), Vs), <label subset of Vs>.
because call_residue_vars/2 also brings internal variables to the surface that, although they have a domain assigned and look like regular CLP(FD) variables, are not meant to directly participate in any labeling.
In contrast, the residual program can be used without any problem for further reasoning, and it is in fact meant to be used that way.
In this concrete case, after labeling the variables whose domains are still finite in the case above, some constraints still remain. They are of the form:
tuples_in([[_G1487,1,_G1496]], [[0,0,0],[0,1,1],[1,0,0],[1,1,2],[2,0,2],[2,1,2]])#<==>_G1518
Exercise: Does it follow from this, however indirectly, that the original query, i.e., seq([1,1],0), cannot hold?
So, to summarize:
Constraint reification does not in itself cause propagation of the constraint that is being reified.
Constraint reification often lets you detect that a constraint cannot hold.
In general, CLP(FD) propagation is necessarily incomplete, i.e., we cannot be sure that there is a solution just because our query succeeds.
labeling/2 lets you see whether there are concrete solutions, if domains are finite.
To see all pending constraints, wrap your query in call_residue_vars/2.
As long as pending constraints remain, it is only a conditional answer.
Recommendation: To make sure that no floundering constraints remain, wrap your query in call_residue_vars/2 and look for any residual constraints on the toplevel.
Consider using the widely available prolog-coroutining predicate when/2 (for details, consider reading the SICStus Prolog manual page on when/2).
Note that you can, in principle, implement freeze/2 like this:
freeze(V,Goal) :-
when(nonvar(V),Goal).
What you are implementing appears to me a variation of the following:
delayed_until_ground_t(Goal,T) :-
( ground(Goal)
-> ( call(Goal)
-> T = true
; T = false
)
; T = true, when(ground(Goal),once(Goal))
; T = false, when(ground(Goal), \+(Goal))
).
Delaying goals can be a really nice feature, but be aware of the perils of delaying forever.
Make sure to read and digest the above answer by #mat regarding call_residue_vars/2!
I am very new to prolog and although I’ve read some books I can definitely tell that my programming brain can’t think the Prolog way. The problem I would like to solve is pretty simple (I believe). I will describe it via an example.
Let’s say that I have a graph that contains 4 “types” of nodes and 3 edges that connect the nodes. The types can be A, B, C or D and as you can see from the image below (see Figure 1), A can be connected with B and C (A_To_B and A_To_C edges respectively), while C can be connected to D (C_To_D edge). There’s also an additional rule not shown on the picture: A can be connected to at most 1 C.
I would like to express these simple rules in Prolog to solve the problem shown in the second picture. There are 3 nodes which type is missing (labeled X?, Y? and Z?). By applying the above rules in my mind I can easily find that X? and Z? are of B type (as A can connect to no more than 1 Cs) and Y? is of type D as C can only connect to D.
Could please provide me any help on that? I am not writing just to pick the solution. I would like to learn Prolog as well so any suggestion on a book that explains Prolog to people who have never worked on such concepts before like me would be very welcome.
EDIT: Example that fails
I came up with the following two examples:
For example 1, the rules are
can_connect(a,b,_).
can_connect(a,c,1).
link(1,2).
type(1,a).
type(2,_).
The possible solutions returned are [b,c] which is correct as we request at most 1 link from A to C meaning that 0 links is also acceptable.
In example 2 the rules change to the following:
can_connect(a,b,_).
can_connect(a,c,**2**).
link(1,2).
link(1,3).
type(1,a).
type(2,_).
type(3,c).
Running the code here returns [c] which is wrong. b is also an acceptable solution as we require again at most 2 A to C links which means that having only 1 is OK.
I spent this weekend trying to figure out the solution. First of all, I believe that it works as intended in Example 1 simply because there's no link from A to C instantiated in the proposed solution (where checking if 2 can be b), so the can_connect(a,c,1) is not checked so the proposed solution is getting accepted. In Example 2, there's one A to C link already there so the can_connect(a,c,2) is checked and the solution where node 2 has type b is rejected as the rule checks if there are exactly 2 and not at most 2 links from A to C.
I find a solution which works at these scenarios but fails at some others. Here it is:
% value #3 is the lower bound and #4 is the upper bound.
can_connect(a,b,0,500).
% A C node can be connected by 0, 1 or 2 A nodes
can_connect(a,c,0,2).
can_connect(d,c,1,1).
can_connect(c,e,0,1).
%The same as previous solution
link(1,2).
link(1,3).
% No change here
type(1,a).
type(2,_).
type(3,c).
% No change here
node_type(N, NT) :-
type(N, NT),
nonvar(NT),
!. % assume a node has only one type
% No change here
node_type(N, NT) :-
assoc_types(Typed),
maplist(check_connections(Typed), Typed),
memberchk(N:NT, Typed).
% No change here
assoc_types(Typed) :-
findall(N, type(N, _), L),
maplist(typed, L, Typed).
% No change here
typed(N, N:T) :-
type(N, T),
member(T, [a,b,c]).
% Changes here
check_connections(Graph, N:NT) :-
forall(link(N, M), (
memberchk(M:MT, Graph),
can_connect(NT, MT, L, U),
findall(X, (link(N, X), memberchk(X:MT, Graph)), Ts),
mybetween(L, U, Ts),
forall(can_connect(NT, Y, LM, UM), (
findall(P, (link(N,P),memberchk(P:Y, Graph)), Ss),
length(Ss, SsSize ),
SsSize>=LM,
SsSize=<UM
))
)).
% It is used to find if the length of a list is between two limits.
mybetween(Lower, Upper, MyList) :-
length(MyList, MySize),
MySize=<Upper,
MySize>=Lower.
This solution fails in this example
In this example, X? must be always b, Y? must always be C and Z? must always be D. It finds X? and Y? correctly but not Z?. I believe after some debugging that this is due the fact that in the current implementation I only check the can_connect rules that are related with links that start from a node and not that end to a node. However, I am not sure at all about that.
Any help is appreciated.
the representation of the problem needs to disambiguate nodes names, so we can express the links appropriately
now we can write
can_connect(a,b,_).
can_connect(a,c,1).
can_connect(c,d,_).
link(1,2).
link(1,3).
link(1,4).
link(4,5).
link(4,6).
link(7,4).
link(7,8).
type(1,a).
type(2,b).
type(3,_).
type(4,c).
type(5,d).
type(6,_).
type(7,a).
type(8,_).
The underscore (anonymous variable) in Prolog plays a role similar to NULL in SQL, it can assume any value.
So, a first snippet
node_type(N, NT) :- type(N, NT), nonvar(NT), !. % assume a node has only one type
can be used to express what we know about the problem.
Facts can_connect/3 then can be read like
a can connect to any number of b
a can connect to just 1 c
etc
Where we don't know the node type, a complex rule is needed, that infers the type of source node from the type of target node, and accounts for the counting constraint, something like
node_type(N, NT) :-
link(M, N),
type(M, MT),
can_connect(MT, NT, C),
aggregate(count, Y^(link(M, Y), type(Y, NT)), C).
?- forall(between(1,8,N), (node_type(N,T),writeln(N:T))).
1:a
2:b
3:b
4:c
5:d
6:d
7:a
8:b
true.
edit if your Prolog doesn't have library(aggregate), from where aggregate/3 has been loaded, you can try
node_type(N, NT) :-
link(M, N),
type(M, MT),
can_connect(MT, NT, C),
findall(t, (link(M, Y), type(Y, NT)), Ts), length(Ts, C).
edit first of all, the updated graph, marked with types where known:
my previous code worked only under very restricted assumptions. Here is something more general, that checks the constraints over the full graph (as was suggested by #false comment), with a 'generate and test' approach.
node_type(N, NT) :-
assoc_types(Typed),
maplist(check_connections(Typed), Typed),
memberchk(N:NT, Typed).
assoc_types(Typed) :-
findall(N, type(N, _), L),
maplist(typed, L, Typed).
typed(N, N:T) :- type(N, T), member(T, [a,b,c,d]).
check_connections(Graph, N:NT) :-
forall(link(N, M), (
memberchk(M:MT, Graph),
can_connect(NT, MT, C),
aggregate(count, X^(link(N, X), memberchk(X:MT, Graph)), C)
)).
now ?- node_type(4,X). fails...
min_member(-Min, +List)
True when Min is the smallest member in the standard order of terms. Fails if List is empty.
?- min_member(3, [1,2,X]).
X = 3.
The explanation is of course that variables come before all other terms in the standard order of terms, and unification is used. However, the reported solution feels somehow wrong.
How can it be justified? How should I interpret this solution?
EDIT:
One way to prevent min_member/2 from succeeding with this solution is to change the standard library (SWI-Prolog) implementation as follows:
xmin_member(Min, [H|T]) :-
xmin_member_(T, H, Min).
xmin_member_([], Min0, Min) :-
( var(Min0), nonvar(Min)
-> fail
; Min = Min0
).
xmin_member_([H|T], Min0, Min) :-
( H #>= Min0
-> xmin_member_(T, Min0, Min)
; xmin_member_(T, H, Min)
).
The rationale behind failing instead of throwing an instantiation error (what #mat suggests in his answer, if I understood correctly) is that this is a clear question:
"Is 3 the minimum member of [1,2,X], when X is a free variable?"
and the answer to this is (to me at least) a clear "No", rather than "I can't really tell."
This is the same class of behavior as sort/2:
?- sort([A,B,C], [3,1,2]).
A = 3,
B = 1,
C = 2.
And the same tricks apply:
?- min_member(3, [1,2,A,B]).
A = 3.
?- var(B), min_member(3, [1,2,A,B]).
B = 3.
The actual source of confusion is a common problem with general Prolog code. There is no clean, generally accepted classification of the kind of purity or impurity of a Prolog predicate. In a manual, and similarly in the standard, pure and impure built-ins are happily mixed together. For this reason, things are often confused, and talking about what should be the case and what not, often leads to unfruitful discussions.
How can it be justified? How should I interpret this solution?
First, look at the "mode declaration" or "mode indicator":
min_member(-Min, +List)
In the SWI documentation, this describes the way how a programmer shall use this predicate. Thus, the first argument should be uninstantiated (and probably also unaliased within the goal), the second argument should be instantiated to a list of some sort. For all other uses you are on your own. The system assumes that you are able to check that for yourself. Are you really able to do so? I, for my part, have quite some difficulties with this. ISO has a different system which also originates in DEC10.
Further, the implementation tries to be "reasonable" for unspecified cases. In particular, it tries to be steadfast in the first argument. So the minimum is first computed independently of the value of Min. Then, the resulting value is unified with Min. This robustness against misuses comes often at a price. In this case, min_member/2 always has to visit the entire list. No matter if this is useful or not. Consider
?- length(L, 1000000), maplist(=(1),L), min_member(2, L).
Clearly, 2 is not the minimum of L. This could be detected by considering the first element of the list only. Due to the generality of the definition, the entire list has to be visited.
This way of handling output unification is similarly handled in the standard. You can spot those cases when the (otherwise) declarative description (which is the first of a built-in) explicitly refers to unification, like
8.5.4 copy_term/2
8.5.4.1 Description
copy_term(Term_1, Term_2) is true iff Term_2 unifies
with a term T which is a renamed copy (7.1.6.2) of
Term_1.
or
8.4.3 sort/2
8.4.3.1 Description
sort(List, Sorted) is true iff Sorted unifies with
the sorted list of List (7.1.6.5).
Here are those arguments (in brackets) of built-ins that can only be understood as being output arguments. Note that there are many more which effectively are output arguments, but that do not need the process of unification after some operation. Think of 8.5.2 arg/3 (3) or 8.2.1 (=)/2 (2) or (1).
8.5.4 1 copy_term/2 (2),
8.4.2 compare/3 (1),
8.4.3 sort/2 (2),
8.4.4 keysort/2 (2),
8.10.1 findall/3 (3),
8.10.2 bagof/3 (3),
8.10.3 setof/3 (3).
So much for your direct questions, there are some more fundamental problems behind:
Term order
Historically, "standard" term order1 has been defined to permit the definition of setof/3 and sort/2 about 1982. (Prior to it, as in 1978, it was not mentioned in the DEC10 manual user's guide.)
From 1982 on, term order was frequently (erm, ab-) used to implement other orders, particularly, because DEC10 did not offer higher-order predicates directly. call/N was to be invented two years later (1984) ; but needed some more decades to be generally accepted. It is for this reason that Prolog programmers have a somewhat nonchalant attitude towards sorting. Often they intend to sort terms of a certain kind, but prefer to use sort/2 for this purpose — without any additional error checking. A further reason for this was the excellent performance of sort/2 beating various "efficient" libraries in other programming languages decades later (I believe STL had a bug to this end, too). Also the complete magic in the code - I remember one variable was named Omniumgatherum - did not invite copying and modifying the code.
Term order has two problems: variables (which can be further instantiated to invalidate the current ordering) and infinite terms. Both are handled in current implementations without producing an error, but with still undefined results. Yet, programmers assume that everything will work out. Ideally, there would be comparison predicates that produce
instantiation errors for unclear cases like this suggestion. And another error for incomparable infinite terms.
Both SICStus and SWI have min_member/2, but only SICStus has min_member/3 with an additional argument to specify the order employed. So the goal
?- min_member(=<, M, Ms).
behaves more to your expectations, but only for numbers (plus arithmetic expressions).
Footnotes:
1 I quote standard, in standard term order, for this notion existed since about 1982 whereas the standard was published 1995.
Clearly min_member/2 is not a true relation:
?- min_member(X, [X,0]), X = 1.
X = 1.
yet, after simply exchanging the two goals by (highly desirable) commutativity of conjunction, we get:
?- X = 1, min_member(X, [X,0]).
false.
This is clearly quite bad, as you correctly observe.
Constraints are a declarative solution for such problems. In the case of integers, finite domain constraints are a completely declarative solution for such problems.
Without constraints, it is best to throw an instantiation error when we know too little to give a sound answer.
This is a common property of many (all?) predicates that depend on the standard order of terms, while the order between two terms can change after unification. Baseline is the conjunction below, which cannot be reverted either:
?- X #< 2, X = 3.
X = 3.
Most predicates using a -Value annotation for an argument say that pred(Value) is the same
as pred(Var), Value = Var. Here is another example:
?- sort([2,X], [3,2]).
X = 3.
These predicates only represent clean relations if the input is ground. It is too much to demand the input to be ground though because they can be meaningfully used with variables, as long as the user is aware that s/he should not further instantiate any of the ordered terms. In that sense, I disagree with #mat. I do agree that constraints can surely make some of these relations sound.
This is how min_member/2 is implemented:
min_member(Min, [H|T]) :-
min_member_(T, H, Min).
min_member_([], Min, Min).
min_member_([H|T], Min0, Min) :-
( H #>= Min0
-> min_member_(T, Min0, Min)
; min_member_(T, H, Min)
).
So it seems that min_member/2 actually tries to unify Min (the first argument) with the smallest element in List in the standard order of terms.
I hope I am not off-topic with this third answer. I did not edit one of the previous two as I think it's a totally different idea. I was wondering if this undesired behaviour:
?- min_member(X, [A, B]), A = 3, B = 2.
X = A, A = 3,
B = 2.
can be avoided if some conditions can be postponed for the moment when A and B get instantiated.
promise_relation(Rel_2, X, Y):-
call(Rel_2, X, Y),
when(ground(X), call(Rel_2, X, Y)),
when(ground(Y), call(Rel_2, X, Y)).
min_member_1(Min, Lst):-
member(Min, Lst),
maplist(promise_relation(#=<, Min), Lst).
What I want from min_member_1(?Min, ?Lst) is to expresses a relation that says Min will always be lower (in the standard order of terms) than any of the elements in Lst.
?- min_member_1(X, L), L = [_,2,3,4], X = 1.
X = 1,
L = [1, 2, 3, 4] .
If variables get instantiated at a later time, the order in which they get bound becomes important as a comparison between a free variable and an instantiated one might be made.
?- min_member_1(X, [A,B,C]), B is 3, C is 4, A is 1.
X = A, A = 1,
B = 3,
C = 4 ;
false.
?- min_member_1(X, [A,B,C]), A is 1, B is 3, C is 4.
false.
But this can be avoided by unifying all of them in the same goal:
?- min_member_1(X, [A,B,C]), [A, B, C] = [1, 3, 4].
X = A, A = 1,
B = 3,
C = 4 ;
false.
Versions
If the comparisons are intended only for instantiated variables, promise_relation/3 can be changed to check the relation only when both variables get instantiated:
promise_relation(Rel_2, X, Y):-
when((ground(X), ground(Y)), call(Rel_2, X, Y)).
A simple test:
?- L = [_, _, _, _], min_member_1(X, L), L = [3,4,1,2].
L = [3, 4, 1, 2],
X = 1 ;
false.
! Edits were made to improve the initial post thanks to false's comments and suggestions.
I have an observation regarding your xmin_member implementation. It fails on this query:
?- xmin_member(1, [X, 2, 3]).
false.
I tried to include the case when the list might include free variables. So, I came up with this:
ymin_member(Min, Lst):-
member(Min, Lst),
maplist(#=<(Min), Lst).
Of course it's worse in terms of efficiency, but it works on that case:
?- ymin_member(1, [X, 2, 3]).
X = 1 ;
false.
?- ymin_member(X, [X, 2, 3]).
true ;
X = 2 ;
false.
I am trying to find the number of occurrences of X in the List L
For eg :-
occurrences(a, [b, a, b, c, a, d, a], N ).
N =3
My code not working .Here is my code.
occ(K,L,N) :- N1=0, occ1(K,L,N1,N).
occ1(K,[],N1,N) :- N=N1.
occ1(K,L,N1,N) :-
L=[X|L1],
( K=X -> N1 is N1+1, occ1(K,L1,N1,N) ; occ1(K,L1,N1,N) ).
Can anybody tell me what's wrong in the code.
While the answer given by #Kay is spot-on as far as fixing the bug is concerned, it completely circumvents a much bigger issue: The code of occ1/4 is logically impure.
This may not appear very important to you right now,
but using impure code has several negative consequences:
Impure code cannot be read declaratively, only procedurally.
Debugging impure code is often tedious and pain-staking.
Impure predicates are less "relational" than their pure counterparts.
Logical impurity hampers code re-use.
Because it is non-monotone, impure code is prone to lead to logically unsound answers, particularly when working with non-ground terms.
To show that these problems persisted in your code after having been "fixed" as suggested #Kay, let us consider the "corrected" code and some queries. First, here's the corrected code:
occ(K,L,N) :- N1=0, occ1(K,L,N1,N).
occ1(_,[],N1,N) :- N=N1.
occ1(K,L,N1,N) :-
L=[X|L1],
( K=X -> N2 is N1+1, occ1(K,L1,N2,N) ; occ1(K,L1,N1,N) ).
Here's the query you gave in your question:
?- occ(a,[b,a,b,c,a,d,a],N).
N = 3 ;
false.
Okay! What if we write the query differently?
?- A=a,B=b,C=c,D=d, occ(a,[B,A,B,C,A,D,A],N).
A = a, B = b, C = c, D = d, N = 3 ;
false.
Okay! What if we reorder goals? Logical conjunction should be commutative...
?- occ(a,[B,A,B,C,A,D,A],N), A=a,B=b,C=c,D=d.
false.
Fail! It seemed that occ1/4 is fine, but now we get an answer that is logically unsound.
This can be avoided by using logically pure code:
Look at the pure and monotone code I gave in my answer to the related question "Prolog - count repititions in list (sic)".
The problem is
N1 is N1+1
Variables cannot be "overwritten" in Prolog. You need to just a new variable, e.g.
N2 is N1+1, occ1(K,L1,N2,N)
To your question "Can we replace a particular list element. If yes, what is the syntax?":
You can only build a new list:
replace(_, _, [], []).
replace(Old, New, [H0|T0], [H1|T1]) :-
(H0 = Old -> H1 = New; H1 = H0),
replace(Old, New, T0, T1).