Reporting *why* a query failed in Prolog in a systematic way

I'm looking for an approach, pattern, or built-in feature in Prolog that I can use to return why a set of predicates failed, at least as far as the predicates in the database are concerned. I'm trying to be able to say more than "That is false" when a user poses a query in a system.
For example, let's say I have two predicates. blue/1 is true if something is blue, and dog/1 is true if something is a dog:
blue(X) :- ...
dog(X) :- ...
If I pose the following query to Prolog and foo is a dog, but not blue, Prolog would normally just return "false":
?- blue(foo), dog(foo).
false.
What I want is to find out why the conjunction of predicates was not true, even if it is an out of band call such as:
?- getReasonForFailure(X).
X = not(blue(foo))
I'm OK if the predicates have to be written in a certain way, I'm just looking for any approaches people have used.
The way I've done this to date, with some success, is by writing the predicates in a stylized way and using some helper predicates to find out the reason after the fact. For example:
blue(X) :-
    recordFailureReason(not(blue(X))),
    isBlue(X).
And then implementing recordFailureReason/1 such that it always remembers the "reason" that happened deepest in the stack. If a query fails, whatever failure happened the deepest is recorded as the "best" reason for failure. That heuristic works surprisingly well for many cases, but does require careful building of the predicates to work well.
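For concreteness, here is a minimal sketch of that heuristic in SWI-Prolog, using its non-backtrackable globals (nb_setval/2, nb_getval/2) and prolog_current_frame/1 as a rough stand-in for stack depth; the predicate names follow the question, but the implementation is only illustrative:

% Track the reason recorded deepest in the stack. The frame reference
% returned by prolog_current_frame/1 grows (roughly) with call depth,
% so it serves as a crude depth measure. Reset with
% nb_setval(bestReason, none-0) before each query.
:- initialization(nb_setval(bestReason, none-0)).

recordFailureReason(Reason) :-
    prolog_current_frame(Frame),
    nb_getval(bestReason, _-BestDepth),
    (   Frame >= BestDepth
    ->  nb_setval(bestReason, Reason-Frame)
    ;   true
    ).

getReasonForFailure(Reason) :-
    nb_getval(bestReason, Reason-_).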
Any ideas? I'm willing to look outside of Prolog if there are predicate logic systems designed for this kind of analysis.

As long as you remain within the pure monotonic subset of Prolog, you may consider generalizations as explanations. To take your example, the following generalizations might be thinkable depending on your precise definition of blue/1 and dog/1.
?- blue(foo), * dog(foo).
false.
In this generalization, the entire goal dog(foo) was removed. The prefix * is actually an operator with a matching predicate, defined as:
:- op(950, fy, *).
*(_).
Informally, the above can be read as: not only does this query fail, even this generalized query fails. There is no blue foo at all (provided there is none). But maybe there is a blue foo, just no blue dog at all...
?- blue(_X/*foo*/), dog(_X/*foo*/).
false.
Now we have generalized the program by replacing foo with the new variable _X. In this manner the sharing between the two goals is retained.
There are more such generalizations possible like introducing dif/2.
This technique can be applied both manually and automatically. For more, there is a collection of example sessions. Also see Declarative program development in Prolog with GUPU.

Some thoughts:
Why did the logic program fail: The answer to "why" is of course "because there is no variable assignment that fulfills the constraints given by the Prolog program".
This is evidently rather unhelpful, but it is exactly the case of the "blue dog": there is no such thing (at least in the problem you model).
In fact the only acceptable answer to the blue dog problem is obtained when the system goes into full theorem-proving mode and outputs:
blue(X) <=> ~dog(X)
or maybe just
dog(X) => ~blue(X)
or maybe just
blue(X) => ~dog(X)
depending on assumptions. "There is no evidence of blue dogs". Which is true, as that's what the program states. So a "why" in this question is a demand to rewrite the program...
There may not be a good answer: "Why is there no x such that x² < 0" is ill-posed and may have as answer "just because" or "because you are restricting yourself to the reals" or "because that 0 in the equation is just wrong" ... so it depends very much.
To make a "why" more helpful, you will have to qualify this "why" somehow. This may be done by structuring the program and extending the query so that additional information collected during proof-tree construction bubbles up, but you will have to decide beforehand what information that is:
query(Sought, [Info1, Info2, Info3])
And this query will always succeed (for query/2, "success" no longer means "success in finding a solution to the modeled problem" but "success in finishing the computation"). The variable Sought will be the reified answer of the actual query you want answered, i.e. one of the atoms true or false (and maybe unknown if you have had enough of two-valued logic), and Info1, Info2, Info3 will be additional details to help you answer a "why" in case Sought is false.
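As a minimal sketch of this pattern, reusing the blue/dog example (the facts and info terms here are made up purely for illustration):

blue(sky).
dog(rex).

% query(-Sought, -Infos): always succeeds; Sought is true or false.
query(true, [witness(X)]) :-
    blue(X), dog(X), !.
query(false, Infos) :-
    findall(not_blue(D), (dog(D), \+ blue(D)), NotBlue),
    findall(not_dog(B), (blue(B), \+ dog(B)), NotDog),
    append(NotBlue, NotDog, Infos).

With these facts, ?- query(Sought, Infos). answers Sought = false, Infos = [not_blue(rex), not_dog(sky)]: the computation finished, and the extra arguments say why the modeled query has no solution.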
Note that much of the time, the desire to ask "why" is down to the mix-up between two distinct failures: "failure in finding a solution to the modeled problem" and "failure in finishing the computation". For example, you want to apply maplist/3 to two lists and expect this to work, but erroneously the two lists are of different lengths: you will get false - but it will be a false from computation (in this case, due to a bug), not a false from modeling. Being heavy-handed with assertion/1 may help here, but this is ugly in its own way.
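For instance, a deliberately buggy call (assuming SWI-Prolog, whose library(yall) provides the lambda syntax):

?- maplist([X,Y]>>(Y is X+1), [1, 2, 3], [A, B]).
false.    % a false from computation: the list lengths differ

Nothing here says "no solution to the modeled problem"; the false only reports that the computation could not finish successfully.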
In fact, compare with imperative or functional languages w/o logic programming parts: In the event of failure (maybe an exception?), what would be a corresponding "why"? It is unclear.
Addendum
This is a great question but the more I reflect on it, the more I think it can only be answered in a task-specific way: you must structure your logic program to be why-able, and you must decide what kind of information the "why" should actually return. It will be something task-specific: something about missing information, "if only this or that were true" indications, where "this or that" is chosen from a dedicated set of predicates. This is of course expected, as there is no general way to make imperative or functional programs explain their results (or lack thereof) either.
I have looked a bit for papers on this (including IEEE Xplore and ACM Library), and have just found:
Reasoning about Explanations for Negative Query Answers in DL-Lite which is actually for Description Logics and uses abductive reasoning.
WhyNot: Debugging Failed Queries in Large Knowledge Bases which discusses a tool for Cyc.
I also took a random look at the documentation for Flora-2 but they basically seem to say "use the debugger". But debugging is just debugging, not explaining.
There must be more.

Related

Why is Prolog's failure by negation not considered logical negation?

In many Prolog guides the following code is used to illustrate "negation by failure" in Prolog.
not(Goal) :- call(Goal), !, fail.
not(Goal).
However, those same tutorials and texts warn that this is not "logical negation".
Question: What is the difference?
I have tried to read those texts further, but they don't elaborate on the difference.
I like @TesselatingHeckler's answer because it puts the finger on the heart of the matter. You might still be wondering what that means for Prolog in more concrete terms. Consider a simple predicate definition:
p(something).
On ground terms, we get the expected answers to our queries:
?- p(something).
true.
?- \+ p(something).
false.
?- p(nothing).
false.
?- \+ p(nothing).
true.
The problems start, when variables and substitution come into play:
?- \+ p(X).
false.
p(X) is not always false because p(something) is true. So far so good. Let's use equality to express substitution and check if we can derive \+ p(nothing) that way:
?- X = nothing, \+ p(X).
X = nothing.
In logic, the order of goals does not matter. But when we want to derive a reordered version, it fails:
?- \+ p(X), X = nothing.
false.
The difference from X = nothing, \+ p(X) is that, when we reach the negation there, we have already unified X, so Prolog tries to derive \+ p(nothing), which we know is true. But in the other order the first goal is the more general \+ p(X), which we saw was false, letting the whole query fail.
This should certainly not happen - in the worst case we would expect non-termination but never failure instead of success.
As a consequence, we cannot rely on our logical interpretation of a clause anymore but have to take Prolog's execution strategy into account as soon as negation is involved.
Logical claim: "There is a black swan".
Prolog claim: "I found a black swan".
That's a strong claim.
Logical negation: "There isn't a black swan".
Prolog negation: "I haven't found a black swan".
Not such a strong claim; the logical version has no room for black swans, while the Prolog version does have room: bugs in the code, poor-quality code not searching everywhere, finite resource limits to searching the entire universe down to swan-sized areas.
The logical negation doesn't need anyone to look anywhere, the claim stands alone separate from any proof or disproof. The Prolog logic is tangled up in what Prolog can and cannot prove using the code you write.
There are a couple of reasons why:
Insufficient instantiation
A goal not(Goal_0) will fail iff Goal_0 succeeds at the point in time when this not/1 is executed. Thus, its meaning depends on the very instantiations that happen to be present when this goal is executed. Changing the order of goals may thus change the outcome of not/1. So conjunction is not commutative.
Sometimes this problem can be solved by reformulating the actual query.
Another way to prevent incorrect answers is to check whether the goal is sufficiently instantiated, by checking that, say, ground(Goal_0) is true, and producing an instantiation error otherwise. The downside of this approach is that instantiation errors are produced quite often, and people do not like them.
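A minimal sketch of such a check (ground_not/1 is a made-up name, not a library predicate):

% Negation that insists on a ground goal, raising an instantiation
% error instead of silently giving an unsound answer.
ground_not(Goal_0) :-
    (   ground(Goal_0)
    ->  \+ Goal_0
    ;   throw(error(instantiation_error, ground_not/1))
    ).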
Yet another way is to delay the execution of Goal_0 appropriately. The techniques to improve the granularity of this approach are called constructive negation. You can find quite a few publications about it, but they have not found their way into general Prolog libraries. One reason is that such programs are particularly hard to debug when many delayed goals are present.
Things get even worse when combining Prolog's negation with constraints. Think of X #> Y, Y #> X, which does not have a solution, but not/1 just sees its success (even if that success is conditional).
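Concretely (assuming SWI-Prolog's library(clpfd)):

?- use_module(library(clpfd)).
true.

?- \+ (X #> Y, Y #> X).
false.    % the conjunction "succeeds" with residual constraints,
          % so its negation fails, although no solution exists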
Semantic ambiguity
With general negation, Prolog's view that there exists exactly one minimal model no longer holds. This is not a problem as long as only stratified programs are considered. But there are many programs that are not stratified yet still correct, like a meta-interpreter that implements negation. In the general case there are several minimal models. Solving this goes far beyond Prolog.
When learning Prolog stick to the pure, monotonic part first. This part is much richer than many expect. And you need to master that part in any case.

Negated possibilities in Prolog

This is a somewhat silly example but I'm trying to keep the concept pretty basic for better understanding. Say I have the following unary relations:
person(steve).
person(joe).
fruit(apples).
fruit(pears).
fruit(mangos).
And the following binary relations:
eats(steve, apples).
eats(steve, pears).
eats(joe, mangos).
I know that querying eats(steve, F). will return all the fruit that steve eats (apples and pears). My problem is that I want to get all of the fruits that Steve doesn't eat. I know that \+ eats(steve, F) will just return "no" because F can't be bound to an infinite number of possibilities; however, I would like it to return mangos, as that's the only existing fruit possibility that Steve doesn't eat. Is there a way to write this that would produce the desired result?
I tried this but no luck here either: \+eats(steve,F), fruit(F).
If a better title is appropriate for this question I would appreciate any input.
Prolog provides only a very crude form of negation; in fact, (\+)/1 simply means "not provable at this point in time of the execution". So you have to take into account the exact moment when (\+)/1 is executed. In your particular case, there is an easy way out:
fruit(F), \+eats(steve,F).
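With the facts from the question, this yields the intended answer:

?- fruit(F), \+ eats(steve, F).
F = mangos.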
In the general case, however, this is far from being fixed easily. Think of \+ X = Y, see this answer.
Another issue is that negation, even if used properly, will introduce non-monotonic properties into your program: By adding further facts for eats/2 less might be deduced. So unless you really want this (as in this example where it does make sense), avoid the construct.
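That non-monotonicity is easy to observe at the top level (this assumes eats/2 was declared dynamic, e.g. with :- dynamic eats/2., so that assertz/1 is permitted):

?- fruit(F), \+ eats(steve, F).
F = mangos.

?- assertz(eats(steve, mangos)).
true.

?- fruit(F), \+ eats(steve, F).
false.    % adding a fact removed a previously derivable answer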

Prolog: How to tell if a predicate is deterministic or not

So from what I understand about deterministic predicates:
Deterministic predicate = 1 solution
Non-deterministic predicate = multiple solutions
Are there any type of rules as to how you can detect if the predicate is one or the other? Like looking at the search tree, etc.
There is no clear, generally accepted consensus about these notions. However, they are usually based on the observed answers rather than on the number of solutions. In certain contexts the notions are very implementation-related. Non-determinate may mean: leaves a choice point open. And sometimes determinate means: never even creates a choice point.
Answers vs. solutions
To see the difference, consider the goal length(L, 1). How many solutions does it have? L = [a] is one, L = [23] another... but all of these solutions are compactly represented with a single answer substitution: L = [_] which thus contains infinitely many solutions.
In any case, in all implementations I know of, length(L, 1) is a determinate goal.
Now consider the goal repeat which has exactly one solution, but infinitely many answers. This goal is considered non-determinate.
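At an SWI-Prolog top level, for example (the exact output shape varies between versions and implementations):

?- length(L, 1).
L = [_].     % one answer standing for infinitely many solutions

?- repeat.
true ;       % one solution, but an unbounded number of answers
true ;
true .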
In case you are interested in constraints, things become even more involved. In library(clpfd), the goal X #> Y, Y #> X has no solution, but still one answer. Combine this with repeat: infinitely many answers and no solution.
Further, the goal append(Xs, Ys, []) has exactly one solution and also exactly one answer, nevertheless it is considered non-determinate in many implementations, since in those implementations it leaves a choice point open.
In an ideal implementation, there would be no answers without solutions (except false), and there would be non-determinism only when there is more than one answer. But then, all of this is mostly undecidable in the general case.
So, whenever you are using these notions make sure on what level things are meant. Rather explicitly say: multiple answers, multiple solutions, leaves no (unnecessary) choice point open.
You need to understand the difference between det, semidet and undet; it is about more than just the number of solutions.
Because there is no loop control operator in Prolog, looping (as opposed to recursion) is constructed as a 'sequence generating' predicate (undet) followed by the loop body. Alternatively, you can store solutions as a list with one of the findall-family predicates and loop over it later with the member/2 predicate.
So any piece of your program is either part of a loop construction or part of the usual flow, and the difference between designing det and undet predicates lies almost entirely in the intended usage. If you can work with a sequence, you always make the predicate undet and comment it as such. There is a nice unit-test extension in SWI-Prolog which can check whether your predicate always behaves as declared in terms of det/semidet/undet (semidet is used the same way as undet, but as the head of an 'if' construction).
So the difference is decided at design time, and this question should not arise for already existing predicates. It is good practice to always state the intended usage in a comment like:
% member(?El, ?List) is undet.
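A small sketch of that unit-test extension, SWI-Prolog's library(plunit): by default a test body is expected to succeed deterministically, and options such as nondet or fail adjust that expectation (the test names here are made up):

:- use_module(library(plunit)).

:- begin_tests(member_demo).

test(finds_first, [nondet]) :-   % leaves a choice point open
    member(a, [a, b, c]).

test(absent, [fail]) :-          % expected to fail
    member(d, [a, b, c]).

:- end_tests(member_demo).

Running ?- run_tests. then warns about tests that leave unexpected choice points.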
Deterministic: Always succeeds with a single answer that is always the same for the same input. Think of a static list of three items, and you tell your function to return value one. You will get the same answer every time. Additionally, arithmetic functions: 1 + 1 = 2. X + Y = Z.
Semi-deterministic: Succeeds with a single answer that is always the same for the same input, but it can fail. Think of a function that takes a list of numbers, and you ask your function if some number exists in the list. It either does, or it doesn't, based on the contents of the list given and the number asked.
Non-deterministic: Succeeds with a single answer, but can exhibit different behaviors on different runs, even for the same input. Think of any kind of math.random(min,max) function, like random/3.
In essence, this is entirely separate from the concept of choice points, as choice points are a function of Prolog. Where I think the Prolog confusion of these terms comes from is that Prolog can find a single answer, then go back and try for another solution, and you have to use the cut operator ! to tell it that you want to discard your choice points explicitly.
This is very useful to know when working with Prolog Unit Testing.

left_of exercise from The Art Of Prolog

In the book we are asked to define the predicates left_of, right_of, above, and below using the following layout.
% bike camera
% pencil hourglass butterfly fish
left_of(pencil, hourglass).
left_of(hourglass, butterfly).
left_of(butterfly, fish).
above(bike, pencil).
above(camera, butterfly).
right_of(Obj1, Obj2) :-
    left_of(Obj2, Obj1).

below(Obj1, Obj2) :-
    above(Obj2, Obj1).
This seems to find correct solutions.
Later in the book we are asked to add a recursive rule for left_of. The only solution I could find is to use a different functor name: left_of2. So I've basically reimplemented the ancestor relationship.
left_of2(Obj1, Obj2) :-
    left_of(Obj1, Obj2).
left_of2(Obj1, Obj2) :-
    left_of(Obj1, X),
    left_of2(X, Obj2).
In my attempts to reuse left_of, I can get all the correct solution but on the final redo, a stack overflow occurs. I'm guessing that's because I don't have a correct base case defined. Can this be coded using left_of for facts and a recursive procedure?
As mentioned in the comments, it is an unfortunate fact that in Prolog you must have separately named predicates to do this. If you don't, you'll wind up with something that looks like this:
left_of(X,Z) :- left_of(X,Y), left_of(Y,Z).
which gives you unbounded recursion twice. There's nothing wrong in principle with facts and predicates sharing the same name--in fact, it's pretty common for a base case rule to look like a fact. It's just that handling a transitive closure situation like this results in stack overflows unless one of the two steps is finite, and there's no other way to ensure that in Prolog than by naming them separately.
This is far from the only case in Prolog where you are compelled to break work down into separate predicates. Other commonly-occurring cases include computational loops with initializers or finalizers.
Conventionally one would wind up naming the predicate differently from the fact. For instance, directly_left_of for the facts and left_of for the predicate. Using the module system or Logtalk you can easily hide the "direct" version and encourage your users to use the transitive version. You can also make the intention more explicit without disallowing it by using an uncomfortable name for the hidden one, like left_of_.
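A minimal sketch of that convention, reusing the facts from the question:

directly_left_of(pencil, hourglass).
directly_left_of(hourglass, butterfly).
directly_left_of(butterfly, fish).

left_of(Obj1, Obj2) :-
    directly_left_of(Obj1, Obj2).
left_of(Obj1, Obj2) :-
    directly_left_of(Obj1, X),
    left_of(X, Obj2).

This terminates because the recursive step always consumes one of the finitely many directly_left_of/2 facts.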
In other languages, a function is a more opaque, larger sort of abstraction and there are facilities for hiding considerable work behind one. Prolog's predicates, by comparison, are "simpler," which used to bother me. Nowadays I'm glad that they're simpler because there's enough other stuff going on that I'm glad I don't also have to figure out variable-arity predicates or keyword arguments (though you can easily simulate both with lists, if you need to).

Prolog negation and logical negation

Assume we have the following program:
a(tom).
v(pat).
and the query (which returns false):
\+ a(X), v(X).
When tracing, I can see that X becomes instantiated to tom, the predicate a(tom) succeeds, therefore \+ a(tom) fails.
1) I have read in some tutorials that the not (\+) in Prolog is just a test and does not cause instantiation. Could someone please clarify this point for me? As far as I can tell, the instantiation does happen.
2) I understand there are differences between not (negation as failure) and logical negation. Could you refer me to a good article that explains in which cases they behave the same and when they behave differently?
Great question.
Short answer: you stumbled upon "floundering".
The problem is that the implementation of the operator \+ only works when applied to a literal containing no variables, i.e., a ground literal. It is not able to generate bindings for variables, but only test whether subgoals succeed or fail. So to guarantee reasonable answers to queries to programs containing negation, the negation operator must be allowed to apply only to ground literals. If it is applied to a nonground literal, the program is said to flounder. (link)
If you invert the query:
v(X), \+ a(X).
You'll get the right answer. Some implementations or meta-interpreters detect floundering goals and delay them until all of the variables are ground.
About your point 1): you see the instantiation inside the NAF tree. What happens there shouldn't affect variables that are outside (in this case in v(X)). Prolog often acts in a naive way to avoid inefficiencies. In theory it should just return an error instead of instantiating the variable.
2) This is my favourite article on the topic: Nonmonotonic Logic Programming.
WRT point 2, the Wikipedia article seems a good starting point.
You already experienced that understanding NAF can be difficult. Part of this could be because (logical) negation is inherently difficult to define even in contexts simpler than predicate calculus (see for instance Russell's paradox), and part because the powerful variables of Prolog are doomed to hold the actual counterexamples of failed negated proofs. See if you can understand the actual library definition of forall/2 (please read the documentation; it's concise and interesting), which is the preferred way to run a failure-driven loop:
%% forall(+Condition, +Action)
%
% True if Action is true for all variable bindings for which Condition
% is true.
forall(Cond, Action) :-
    \+ (Cond, \+ Action).
I remember the first time I saw it, it looked like magic...
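A quick usage example:

?- forall(member(X, [1,2,3]), X > 0).
true.

?- forall(member(X, [1,-2,3]), X > 0).
false.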
edit: about a tutorial - while 'spelunking' my links collection I found a good site by J.R.Fisher. It's full of interesting stuff, just a pity it's a bit terse in its explanations, requiring students to answer for themselves via frequent exercises. See paragraph 2.5, devoted to negation by failure. I think you could also enjoy section 3, How Prolog Works.
