Difference between two variant implementations - prolog

Is there any logical difference between these two implementations of a variant predicate?
variant1(X,Y) :-
    subsumes_term(X,Y),
    subsumes_term(Y,X).

variant2(X_,Y_) :-
    copy_term(X_,X),
    copy_term(Y_,Y),
    numbervars(X, 0, N),
    numbervars(Y, 0, N),
    X == Y.

Neither variant1/2 nor variant2/2 implements a test for being a syntactic variant, but for different reasons.
The goal variant1(f(X,Y),f(Y,X)) should succeed but fails. In some cases where the same variable appears on both sides, variant1/2 does not behave as expected. To fix this, use:
variant1a(X, Y) :-
    copy_term(Y, YC),
    subsumes_term(X, YC),
    subsumes_term(YC, X).
The goal variant2(f('$VAR'(0),_),f(_,'$VAR'(0))) should fail but succeeds. Clearly, variant2/2 assumes that no '$VAR'/1 terms occur in its arguments.
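The two counterexamples above can be checked directly. The following sketch repeats the three definitions so that it loads on its own. The outcomes in the comments are the ones stated above; it assumes a system that provides subsumes_term/2 and numbervars/3 (SWI-Prolog, for example).

```prolog
% Definitions repeated from the question and the answer above.
variant1(X, Y) :-
    subsumes_term(X, Y),
    subsumes_term(Y, X).

variant1a(X, Y) :-
    copy_term(Y, YC),
    subsumes_term(X, YC),
    subsumes_term(YC, X).

variant2(X_, Y_) :-
    copy_term(X_, X),
    copy_term(Y_, Y),
    numbervars(X, 0, N),
    numbervars(Y, 0, N),
    X == Y.

% ?- variant1(f(X,Y), f(Y,X)).                   % fails, should succeed
% ?- variant1a(f(X,Y), f(Y,X)).                  % succeeds
% ?- variant2(f('$VAR'(0),_), f(_,'$VAR'(0))).   % succeeds, should fail
```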
ISO/IEC 13211-1:1995 defines variants as follows:
7.1.6.1 Variants of a term
Two terms are variants if there is a bijection s of the
variables of the former to the variables of the latter such that
the latter term results from replacing each variable X in the
former by Xs.
NOTES
1 For example, f(A, B, A) is a variant of f(X, Y, X),
g(A, B) is a variant of g(_, _), and P+Q is a variant of
P+Q.
2 The concept of a variant is required when defining bagof/3
(8.10.2) and setof/3 (8.10.3).
Note that the Xs above is not a variable name but rather (X)s. So s is here a bijection, which is a special case of a substitution.
Here, all examples refer to typical usages in bagof/3 and setof/3 where variables happen to be always disjoint, but the more subtle case is when there are common variables.
In logic programming, the usual definition is rather:
V is a variant of T iff there exists σ and θ such that
Vσ and T are identical
Tθ and V are identical
In other words, they are variants if both match each other. However, the notion of matching as used in formal logic is pretty alien to Prolog programmers. Here is a case which makes many Prolog programmers panic:
Consider f(X) and f(g(X)). Does f(g(X)) match f(X) or not? Many Prolog programmers will now shrug their shoulders and mumble something about the occurs-check. But this is entirely unrelated to the occurs-check. They match, yes, because
f(X){ X ↦ g(X) } is identical to f(g(X)).
Note that this substitution replaces every occurrence of X by g(X). How can this happen? In fact, it cannot happen with Prolog's typical term representation as a graph in memory. In Prolog the node X is a real address somewhere in memory, and you cannot do such an operation at all. But in logic things are on an entirely textual level. It's just like
sed 's/\<X\>/g(X)/g'
except that one can also replace variables simultaneously. Think of { X ↦ Y, Y ↦ X}. They have to be replaced at once, otherwise f(X,Y) would shrink into f(X,X) or f(Y,Y).
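To make the "all at once" point concrete, here is a minimal sketch of simultaneous substitution over Prolog terms. The predicate names apply_subst/3, apply_subst_args/3 and lookup/3, and the representation of a substitution as a list of Var-Replacement pairs, are assumptions made for this illustration only. Replacements are never traversed again, which is what makes the substitution simultaneous.

```prolog
% apply_subst(+Pairs, +Term, -Result)
% Pairs is a list of Var-Replacement pairs (hypothetical representation).
apply_subst(Pairs, Term, Result) :-
    (   var(Term)
    ->  lookup(Pairs, Term, Result)
    ;   Term =.. [F|Args],
        apply_subst_args(Pairs, Args, NewArgs),
        Result =.. [F|NewArgs]
    ).

apply_subst_args(_, [], []).
apply_subst_args(Pairs, [A|As], [B|Bs]) :-
    apply_subst(Pairs, A, B),
    apply_subst_args(Pairs, As, Bs).

% lookup(+Pairs, +Var, -Replacement)
% A variable without an entry in Pairs is left as it is.
lookup([], V, V).
lookup([W-T|Ps], V, R) :-
    (   V == W
    ->  R = T
    ;   lookup(Ps, V, R)
    ).

% ?- apply_subst([X-g(X)], f(X), T).        % T = f(g(X))
% ?- apply_subst([X-Y, Y-X], f(X,Y), T).    % T = f(Y,X)
```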
So this definition, while formally perfect, relies on notions that have no direct correspondence in Prolog systems.
Similar problems arise when one-sided unification is considered, which is not matching but rather the case that unification and matching have in common.
According to ISO/IEC 13211-1:1995 Cor.2:2012 (draft):
8.2.4 subsumes_term/2
This built-in predicate provides a test for syntactic one-sided unification.
8.2.4.1 Description
subsumes_term(General, Specific) is true iff there is a
substitution θ such that
a) Generalθ and Specificθ are identical, and
b) Specificθ and Specific are identical.
Procedurally, subsumes_term(General, Specific) simply
succeeds or fails accordingly. There is no side effect or
unification.
For your definition of variant1/2, subsumes_term(f(X,Y),f(Y,X)) already fails.
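The description of subsumes_term/2 quoted above can be approximated with other standard built-ins. The following is only a sketch: the name subsumes_sketch/2 is made up for this example, and it assumes that unify_with_occurs_check/2 and term_variables/2 are available. The idea is to unify both terms and then check that the variables of Specific are still distinct variables; the double negation undoes all bindings.

```prolog
% subsumes_sketch(+General, +Specific)
% Succeeds without leaving bindings if General can be made identical to
% Specific by instantiating General only.
subsumes_sketch(General, Specific) :-
    \+ \+ (
        term_variables(Specific, SVars1),
        unify_with_occurs_check(General, Specific),
        term_variables(SVars1, SVars2),
        SVars1 == SVars2
    ).

% ?- subsumes_sketch(f(_), f(a)).          % succeeds
% ?- subsumes_sketch(f(a), f(_)).          % fails
% ?- subsumes_sketch(f(X,Y), f(Y,X)).      % fails
% ?- subsumes_sketch(f(X), f(g(X))).       % fails, although f(X) matches f(g(X))
```

The last query shows how one-sided unification differs from matching in the textual sense discussed earlier.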

Related

Does the Prolog symbol :- mean Implies, Entails or Proves?

In Prolog we can write very simple programs like this:
mammal(dog).
mammal(cat).
animal(X) :- mammal(X).
The last line uses the symbol :- which informally lets us read the final fact as: if X is a mammal then it is also an animal.
I am beginning to learn Prolog and trying to establish which of the following is meant by the symbol :-
Implies (⇒)
Entails (⊨)
Provable (⊢)
In addition, I am not clear on the difference between these three. I am trying to read threads like this one, but the discussion is at a level above my capability, https://math.stackexchange.com/questions/286077/implies-rightarrow-vs-entails-models-vs-provable-vdash.
My thinking:
Prolog works by pattern-matching symbols (unification and search) and so we might be tempted to say the symbol :- means 'syntactic entailment'. However this would only be true of queries that are proven to be true as a result of that syntactic process.
The symbol :- is used to create a database of facts, and therefore is semantic in nature. That means it could be one of Implies (⇒) or Entails (⊨) but I don't know which.
Neither. Or rather, if anything, it is implication. The other symbols sit a level above: they belong to the meta-language. The Mathematics Stack Exchange answers explain this quite nicely.
To see why :- is not quite an implication, consider:
p :- p.
In logic, both truth values make this a valid sentence. But in Prolog we stick to the minimal model. So p is false. Prolog uses a subset of predicate logic such that there actually is only one minimal model. And worse, Prolog's actual default execution strategy makes this an infinite loop.
Nevertheless, the most intuitive way to read LHS :- RHS. is to see it as a way to generate new knowledge: provided RHS is true, it follows that LHS is also true. This way one avoids all the paradoxes related to implication.
The right-to-left direction is a bit counterintuitive. It is motivated by Prolog's actual execution strategy (which goes left-to-right in this representation).
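The minimal-model reading of p :- p. can be observed in systems that support tabling. The sketch below assumes the table/1 directive as found in SWI-Prolog, XSB and similar systems. Without the directive, the query loops as described above.

```prolog
% Assumption: the system supports tabling via the table/1 directive.
:- table p/0.

p :- p.

% ?- p.    % fails finitely: p is false in the minimal model
```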
:- is usually read as if, so something like:
a :- b, c .
reads as
| a is true if b and c are true.
In formal logic, the above would be written as
| a ← b ∧ c
Or
| b and c imply a

Prolog - proof tree misses possibilities

I have the following Prolog Program:
p(f(X), Y) :- p(g(X), g(Y)).
p(g(X), Y) :- p(f(Y), f(X)).
p(f(a), g(b)).
The prolog proof tree has to be drawn for the predicate p(X, Y).
Question:
Why is Y matched to Y1/Y and not to Y/Y1 and why is Y used further on?
if I match a predicate (e.g. p(X, Y)), I get a new predicate (e.g. p(g(X1), g(Y))) - why contains p(g(X1), g(Y)) just one subtree? I mean, shouldn't it have 3 because the knowledgebase contains 3 statements - instead of just 1?
And why is at each layer of the tree matched with something like X2/X1 and so on ? and not with the predicate before ?
Shouldn't it be g(X1)/fX5, g(Y1)/Y5 ?
Note: Maybe it seems that I have never done a tutorial or something. But I did.. I appreciate every help.
To be honest, I have rarely seen a worse method to explain Prolog than what you show here.
Yes, I expect the author meant Y/Y1 instead of Y1/Y in both cases, otherwise the notation would be quite inconsistent.
As to your other questions: You are facing the usual problems that arise when taking such an extremely operational view of Prolog. The core issue is that this method doesn't scale: You do not have the mental capacity to carry this approach through. Don't take this personally: Humans in general are bad at keeping in mind all the details of an execution tree that grows exponentially. This makes the whole approach extremely cumbersome and error-prone. For comparison, consider why human grandmasters stopped competing against chess computers many years ago. In this concrete case, note for example that the rightmost branch does not even arise in actual Prolog execution, but the graph wrongly suggests that it does!
Part of the problem here is a confusion in terminology: Please note that Prolog uses unification (not "matching", which is one-sided unification). When you unify a goal with a clause head and the unification succeeds, then you get bindings for variables. You continue with these bindings in place.
To make the whole approach remotely feasible, consider fragments of your program.
For example, suppose I only give you the following fact:
p(f(a), g(b)).
And you then query:
?- p(X, Y).
X = f(a),
Y = g(b).
This answer shows the bindings for X and Y. First make sure you understand this, and understand the difference between these bindings and a "new predicate" (which does not arise!).
Also, there are no "statements", but 3 clauses, which are logical alternatives.
Now, again to simplify the whole task, consider the following fragment of your program, in which I only look at the two rules:
p(f(X), Y) :- p(g(X), g(Y)).
p(g(X), Y) :- p(f(Y), f(X)).
Already with this program, we get:
?- p(X, Y).
nontermination
Adding a further pure clause cannot prevent this nontermination. Thus, I recommend you start with this reduced version of your program, and consider it in more depth.
From there, you can add the remaining fact again, and consider the differences.
Very good questions!
Why is Y matched to Y1/Y and not to Y/Y1 and why is Y used further on?
The naming here seems a little arbitrary in that they could have used Y/Y1 but then would need to use Y1 further on. In this case, they chose Y1/Y and use Y further on. Although the author of this expression tree was inconsistent in their convention, I wouldn't be too concerned about the naming as much as whether they follow the variable correctly down the tree.
if I match a predicate (e.g. p(X, Y)), I get a new predicate (e.g. p(g(X1), g(Y))) - why contains p(g(X1), g(Y)) just one subtree? I mean, shouldn't it have 3 because the knowledgebase contains 3 statements - instead of just 1?
First a word on term versus predicate. A term is only a predicate in the context of Head :- Body, in which case Head is a term that forms the head of a predicate clause. If a term is an argument to a predicate (for example, in p(g(X1), g(Y))), then g(X1) and g(Y) are not predicates. They are just terms.
More specifically in this case, the term p(g(X1), g(Y)) only has one subtree because it only matches the head of one of the 3 predicate clauses, namely the one with the head p(g(X), Y). It matches with X = X1 and the clause's own Y bound to g(Y); the two Ys are different variables that happen to share a name. The other two can't match since they're of the form p(f(...), ...) and the f(...) term cannot match the g(X1) term.
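Which clause heads a goal can be resolved against may be checked mechanically. The sketch below stores the three heads of the program as facts of a hypothetical predicate head/2 and collects the numbers of those that unify with a given goal; unifying_heads/2 is likewise a name invented for this example.

```prolog
% The three clause heads of the program, numbered in source order.
head(1, p(f(_), _)).
head(2, p(g(_), _)).
head(3, p(f(a), g(b))).

% unifying_heads(+Goal, -Ns): Ns are the numbers of the clauses whose
% head unifies with Goal. Bindings are undone between candidates.
unifying_heads(Goal, Ns) :-
    findall(N, ( head(N, H), Goal = H ), Ns).

% ?- unifying_heads(p(X, Y), Ns).            % Ns = [1,2,3]
% ?- unifying_heads(p(g(X1), g(Y)), Ns).     % Ns = [2]
```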
And why is at each layer of the tree matched with something like X2/X1 and so on ? and not with the predicate before ?
Shouldn't it be g(X1)/fX5, g(Y1)/Y5 ?
I'm not sure I'm following this question, but the principle to follow is that the tree is attempting to use the same variable name if it applies to the same variable in memory, whereas a different variable name (e.g., X1 versus X) is used if it's a different X. For example, if I have foo(X, Y) :- <some code>, bar(f(X), Y). and I have bar(X, Y) :- blah(X), ... then the X referred to in the bar predicate is different than the X referred to in the foo predicate. So we might say, in the call to foo(X, Y) we're calling bar(f(X), Y), or alternatively, bar(X1, Y) where X1 = f(X).

represent "there is X where a(X) is not true" and alike in prolog

Let's say I have some predicate a/1. How would I represent b, which is true if a fails for some value?
Unfortunately, not doesn't help here. A definition like this:
b(X):- not(a(X)).
means "b is true if for any X a is false" (I want this to work when X isn't instantiated).
How would someone express this? And what about the general case where more than one (not instantiated) variable exists?
Is there more known about a/1?
Many Prolog predicates do have purely relational, sound negations.
For example, the unification X = Y can be cleanly stated not to hold by using the constraint dif/2: dif(X, Y) is true iff X and Y are different. It works correctly in all modes of use.
Similarly, CLP(FD) constraints like (#=)/2, (#>)/2 and others all have completely sound logical negations. For example, you can say X #\= Y to state that X and Y are distinct integers.
A general way to express such issues is to reify the truth values of your predicates. For example, instead of a predicate a/1, consider a predicate a/2, where the second argument denotes whether the predicate holds in this case. You would call this as a(Arg, Truth), and your job is to implement it in such a way that Truth correctly reflects the truth value of a/1 for Arg. You can throw an instantiation_error in cases where you cannot make a sound decision. The preferable way is of course to declaratively express all possible cases using suitable constraints.
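As a minimal sketch of such a reification, suppose (purely as an assumption for this example) that a/1 holds exactly for the atom foo. The reified predicate a_t/2 is a made-up name, and dif/2 is assumed to be available as in SWI-Prolog, SICStus and others.

```prolog
% Hypothetical relation: a(X) holds exactly when X = foo.
a(foo).

% a_t(X, Truth): Truth is true if a(X) holds and false otherwise.
a_t(X, true)  :- X = foo.
a_t(X, false) :- dif(X, foo).

% b(X): a/1 does not hold for X. Works for instantiated and
% uninstantiated X alike.
b(X) :- a_t(X, false).

% ?- b(foo).    % fails
% ?- b(bar).    % succeeds
% ?- b(X).      % succeeds with the residual constraint dif(X, foo)
```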
In some cases, constraint reification is already available out of the box. For example, you can negate all reifiable CLP(FD) constraints using the predicate (#\)/1. Therefore, #\ (X #= Y) is the same as X #\= Y. Boolean constraints provide similar features.
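A short sketch of negating a reifiable CLP(FD) constraint follows. It assumes library(clpfd) as shipped with SWI-Prolog; not_equal/2 is a name chosen for this example.

```prolog
:- use_module(library(clpfd)).

% not_equal(X, Y): X and Y are distinct integers, stated by negating
% the reifiable constraint X #= Y.
not_equal(X, Y) :- #\ (X #= Y).

% ?- not_equal(1, 2).           % succeeds
% ?- not_equal(3, 3).           % fails
% ?- not_equal(X, 5), X = 5.    % fails, whichever goal comes first
```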
As pointed out before, there is no logical negation in Prolog, since there is no closed universe. Prolog negation is negation by failure. That is, something is considered false if it cannot be proved to be true.
In practice, not/1 (or (\+)/1) requires a ground goal in order to behave as logical negation.
You may find some experiments with logical negation (closed universes or domains) in some development environments (as far as I remember, Ciao Prolog has something about that). It requires variables to be declared as having values in some finite domain.
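The groundness requirement for (\+)/1 mentioned above can be seen in a two-line experiment. The fact a(1) and the two wrapper predicates are assumptions made for this illustration; logically, both wrappers state the same conjunction.

```prolog
% Hypothetical database: a/1 holds for 1 only.
a(1).

% Negation is called while X is still unbound.
nonground_first(X) :- \+ a(X), X = 2.

% Negation is called after X has become ground.
ground_first(X) :- X = 2, \+ a(X).

% ?- nonground_first(X).    % fails, since a(X) has a solution
% ?- ground_first(X).       % X = 2
```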

Does Prolog use Eager Evaluation?

Because Prolog uses chronological backtracking (from the Prolog Wikipedia page) even after an answer is found (in this example where there can only be one solution), would this justify describing Prolog as using eager evaluation?
mother_child(trude, sally).
father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).
sibling(X, Y) :- parent_child(Z, X), parent_child(Z, Y).
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).
With the following output:
?- sibling(sally, erica).
true ;
false.
To summarize the discussion with @WillNess below, yes, Prolog is strict. However, Prolog's execution model and semantics are substantially different from the languages that are usually labelled strict or non-strict. For more about this, see below.
I'm not sure the question really applies to Prolog, because it doesn't really have the kind of implicit evaluation ordering that other languages have. Where this really comes into play in a language like Haskell, you might have an expression like:
f (g x) (h y)
In a strict language like ML, there is a defined evaluation order: g x will be evaluated, then h y, and f (g x) (h y) last. In a language like Haskell, g x and h y will only be evaluated as required ("non-strict" is more accurate than "lazy"). But in Prolog,
f(g(X), h(Y))
does not have the same meaning, because it isn't using a function notation. The query would be broken down into three parts, g(X, A), h(Y, B), and f(A,B,C), and those constituents can be placed in any order. The evaluation strategy is strict in the sense that what comes earlier in a sequence will be evaluated before what comes next, but it is non-strict in the sense that there is no requirement that variables be instantiated to ground terms before evaluation can proceed. Unification is perfectly content to complete without having given you values for every variable. I am bringing this up because you have to break down a complex, nested expression in another language into several expressions in Prolog.
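The flattening described above can be sketched as follows. The definitions of g/2, h/2 and f/3 are arbitrary placeholders invented for this example; only the shape of nested/3 matters.

```prolog
% Placeholder relations standing in for the functions g, h and f.
g(X, A) :- A is X + 1.
h(Y, B) :- B is Y * 2.
f(A, B, C) :- C is A + B.

% The functional expression f (g x) (h y), written as a conjunction.
% The goals run left to right, so g/2 is called before h/2 and f/3.
nested(X, Y, C) :-
    g(X, A),
    h(Y, B),
    f(A, B, C).

% ?- nested(1, 2, C).    % C = 6
```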
Backtracking has nothing to do with it, as far as I can tell. I don't think backtracking to the nearest choice point and resuming from there precludes a non-strict evaluation method, it just happens that Prolog's is strict.
That Prolog pauses after giving each of the several correct answers to a problem has nothing to do with laziness; it is a part of its user interaction protocol. Each answer is calculated eagerly.
Sometimes there will be only one answer but Prolog doesn't know that in advance, so it waits for us to press ; to continue search, in hopes of finding another solution. Sometimes it is able to deduce it in advance and will just stop right away, but only sometimes.
update:
Prolog does no evaluation on its own. All terms are unevaluated, as if "quoted" in Lisp.
Prolog will unfold your predicate definitions as written and is perfectly happy to keep your data structures full of unevaluated uninstantiated holes, if so entailed by your predicate definitions.
Haskell does not need any values, a user does, when requesting an output.
Similarly, Prolog produces solutions one-by-one, as per the user requests.
Prolog can even be seen to be lazier than Haskell where all arithmetic is strict, i.e. immediate, whereas in Prolog you have to explicitly request the arithmetic evaluation, with is/2.
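A small sketch of the point about is/2: unification leaves an arithmetic expression as an unevaluated term, and evaluation happens only on explicit request. The predicate names are made up for this example.

```prolog
% T is bound to the term 1+2; nothing is evaluated.
unevaluated(T) :- T = 1 + 2.

% V is bound to the result of evaluating 1+2.
evaluated(V) :- V is 1 + 2.

% ?- unevaluated(T).    % T = 1+2
% ?- evaluated(V).      % V = 3
```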
So perhaps the question is ill-posed. Prolog's operational model is just too different. There are no "results" nor "functions", for one; but viewed from another angle, everything is a result, and predicates are "multi"-functions.
As it stands, the question is not correct in what it states. Chronological backtracking does not mean that Prolog will necessarily backtrack "in an example where there can be only one solution".
Consider this:
foo(a, 1).
foo(b, 2).
foo(c, 3).
?- foo(b, X).
X = 2.
?- foo(X, 2).
X = b.
So this is an example that does have only one solution and Prolog recognizes that, and does not attempt to backtrack. There are cases in which you can implement a solution to a problem in a way that Prolog will not recognize that there is only one logical solution, but this is due to the implementation and is not inherent to Prolog's execution model.
You should read up on Prolog's execution model. From the Wikipedia article which you seem to cite, "Operationally, Prolog's execution strategy can be thought of as a generalization of function calls in other languages, one difference being that multiple clause heads can match a given call. In that case, [emphasis mine] the system creates a choice-point, unifies the goal with the clause head of the first alternative, and continues with the goals of that first alternative." Read Sterling and Shapiro's "The Art of Prolog" for a far more complete discussion of the subject.
from Wikipedia I got
In eager evaluation, an expression is evaluated as soon as it is bound to a variable.
Then I think there are 2 levels - at user level (our predicates) Prolog is not eager.
But it is at 'system' level, because variables are implemented as efficiently as possible.
Indeed, attributed variables are implemented to be lazy, and are rather 'orthogonal' to 'logic' Prolog variables.

Datalog Stratification

So I'm trying to understand how Datalog works and one of the differences between it and Prolog is that it has stratification limitations placed upon negation and recursion.
To quote Wikipedia:
If a predicate P is positively derived from a predicate Q (i.e., P is
the head of a rule, and Q occurs positively in the body of the same
rule), then the stratification number of P must be greater than or
equal to the stratification number of Q
If a predicate P is derived from a negated predicate Q (i.e., P is the
head of a rule, and Q occurs negatively in the body of the same rule),
then the stratification number of P must be greater than the
stratification number of Q,
So, going by this, the two following predicates do not result in a stratification error as they can simply be assigned the same stratification number. So these predicates are fine, despite the circular definition.
A(x) :- B(x)
B(x) :- A(x)
But contrast that with what happens if we have a definition which has some negation involved (Where ~ is negation)
A(x) :- ~ B(x)
B(x) :- ~ A(x)
Here a stratification is impossible. A(x) must have a stratification number greater than B(x), and B(x) must have a stratification number greater than A(x). My first thought was that this was not okay because this is a circular definition, but stratification is fine with circularity so long as the predicates are not negated. But why? Truth values are simply binary. It seems extremely arbitrary to treat formulas which have a negation symbol differently in this manner. What is this stratification trying to prevent in the second case which isn't in the first?
I think the problem with:
A(x) :- \+ B(x)
B(x) :- \+ A(x)
...is that it has ambiguous semantics. This program has two minimal models, namely, {A(x)} and {B(x)}, and is therefore not well-defined under the fixed point semantics (no fixed point) or under the model theoretic semantics (no unique minimal model).
In order to address this problem, stratified semantics for Datalog imposes restrictions on the syntax of Datalog programs such that, if a stratification exists for the program, then it will also have a unique, minimal model in both the fixed point and model theoretic semantics (and vice-versa, I believe).
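For contrast, here is a small program that does use negation and is still stratifiable, written in Prolog syntax with \+ for negation. The predicates and facts are invented for this example. The predicates edge/2, node/1 and reach/2 can share stratum 0 (reach/2 depends positively on itself), while unreachable/2 goes into stratum 1 because it depends negatively on reach/2.

```prolog
% Stratum 0: base facts and positive recursion.
node(a). node(b). node(c). node(d).
edge(a, b).
edge(b, c).

reach(X, Y) :- edge(X, Y).
reach(X, Y) :- edge(X, Z), reach(Z, Y).

% Stratum 1: negation of a predicate that is fully defined below it.
unreachable(X, Y) :- node(X), node(Y), \+ reach(X, Y).

% ?- unreachable(a, d).    % succeeds
% ?- unreachable(a, c).    % fails
```

Because reach/2 is completely determined before unreachable/2 is evaluated, the negation has a single, unambiguous meaning.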
You can find more on the precise details of stratified semantics for Datalog in the text "Foundations of Databases" by Serge Abiteboul, Richard Hull, and Victor Vianu, which happens to be freely available online, with the relevant detail in Chapter 15. This excellent text also explains most of the other terms I've used above like model, fixed-point, etc. if you're stuck.

Resources