How are anonymous variables interpreted in Prolog?

A quick and simple question regarding what role anonymous variables play in the resolution of a Prolog query given a set of program rules. The way I understand the simplest form of SLD resolution, an SLD tree is constructed by taking some term from a set of goal terms (based on a selection rule, e.g. FIRST) and going through all the program rules to see which rule's left-hand side (the consequent, so to say) can be unified with the term at hand. The way to unify two given terms is to take the difference set of the two terms and see if variables can be substituted for terms such that the difference vanishes. You do this by successively taking the leftmost single difference, checking whether one of the two terms constituting the difference is a variable not appearing in the other, and composing your current substitution with one mapping that variable onto the term (starting with the empty or identity substitution).
Now, when anonymous variables (_) come into play, I suspect the trick in doing it correctly and efficiently lies in changing the way you determine the leftmost difference between two terms to ignore a pair of terms whenever one of them is an anonymous variable. The obviously correct way to do it would be to rename every instance of _ in the goal and the program set to a new variable name and solve using those.
How is it actually done? Is my idea sufficient, or is there more to it than that? (Also, would appreciate it very much if something is missing in the way I understand SLD resolution works, barring negation, call, capsuling, arithmetic predicates and the more complicated stuff.)

Prolog anonymous variables don't play a role in SLD resolution or in term unification but do play a practical role in Prolog code and Prolog queries. A fundamental aspect of anonymous variables is that each occurrence of an anonymous variable is a different variable. Consider the following query:
| ?- a(_, _) = a(1, 2).
yes
The unification would have failed if the two anonymous variables were the same variable. Now consider the query:
| ?- a(X, _) = a(1, 2).
X = 1
yes
Variable bindings are only reported for variables that are not anonymous variables. This allows using an anonymous variable every time we are not interested in any bindings for a variable.
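To make the contrast concrete, here is a small sketch that puts the anonymous-variable unification next to the same unification with one named variable used twice. The predicate names distinct_anonymous/0 and shared_named/0 are made up for this illustration.

```prolog
% Each occurrence of _ is a separate variable, so the two
% arguments may be bound to different values: this succeeds.
distinct_anonymous :-
    a(_, _) = a(1, 2).

% X is one variable used twice, so it would have to be
% both 1 and 2 at the same time: this fails.
shared_named :-
    a(X, X) = a(1, 2).
```

Calling distinct_anonymous succeeds, while calling shared_named fails.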
Anonymous variables also simplify writing predicate definitions where they similarly act as "don't care" variables. Consider as an example the usual definition of the member/2 predicate:
member(Element, [Element| _]).
member(Element, [_| List]) :-
    member(Element, List).
In the first clause, we don't care about the list tail. In the second clause, we don't care about the list head. By using anonymous variables, we can ignore those sub-terms and avoid the compiler complaining about variables that would be used once in a clause.
Update
Note that all different variables in a query get unique internal variable references, not to be confused with variable names as typed by the user. The variable names are only used by the top-level interpreter to report bindings for successful queries. The inference mechanism used to prove a query uses the (internal) variable references. The following query, using the ISO Prolog standard read_term/2 predicate with standard options, may help:
| ?- read_term(Term, [variable_names(Names), variables(Variables)]).
a(X, _, Y, _).
Names = ['X'=A,'Y'=B]
Term = a(A,C,B,D)
Variables = [A,C,B,D]
yes
In the term read, there are four distinct variables but only two of them have (user provided) names.
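The same count can be checked without reading from the terminal. The following is only a sketch: count_variables/2 is a helper name invented here, and it relies on term_variables/2, which collects the distinct variables of a term whether or not they were given a name.

```prolog
% count_variables(+Term, -Count): Count is the number of distinct
% variables occurring in Term, named or anonymous.
count_variables(Term, Count) :-
    term_variables(Term, Variables),
    length(Variables, Count).
```

For example, the query count_variables(a(X, _, Y, _), N) gives N = 4, whereas count_variables(a(X, X), N) gives N = 1.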

This is a comment in an answer because a comment can not format this as needed.
Using SWI-Prolog
?- trace,(_=_).
Call: (11) _1834=_1836 ? creep
Exit: (11) _1834=_1834 ? creep
true.
Each anonymous variable is created as a separate variable. When the unification takes place the one variable is unified with the other variable.


What is Prolog saying about an uninstantiated variable?

Say we were to execute the following, and SWI Prolog responds:
?- write(X).
_13074
true.
What is _13074? Is this an address? Is this an ID of some sort? I notice that we'll get a different value each time. Furthermore, why does Prolog report true.? Is this Prolog saying that anything can be unified with X? Why do these appear in the order they do?
If we were to unify X with, say, 1, then the result is different.
?- X = 1, write(X).
1
X = 1.
What is _13074?
The simplest answer is that it represents an uninstantiated variable.
To be more precise from the Prolog Standard
anonymous variable: A variable (represented in a term or Prolog text by _) which differs from every other variable (and anonymous variable) (see 6.1.2, 6.4.3)
instantiated: A variable is instantiated with respect to substitution if application of the substitution yields an atomic term or a compound term.
A term is instantiated if any of its variables are instantiated.
uninstantiated: A variable is uninstantiated when it is not instantiated.
variable: An object which may be instantiated to a term during execution.
named variable: A variable which is not an anonymous variable (see 6.1.2, 6.4.3)
So obviously all of that is self-referential, but in short, by the standard there are anonymous variables (_) and named variables, e.g. X, Y, Ls.
Note that the standard does not say what is the difference between _ and variables with numbers in the suffix, E.g. _13074. Those are implementation specific.
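As a rough illustration of the instantiated/uninstantiated distinction, the standard type-checking predicates var/1 and nonvar/1 can be used to ask about a term at run time. The wrapper describe/2 below is a made-up name for this sketch only.

```prolog
% describe(+Term, -State): classify Term by its instantiation
% at the moment of the call.
describe(Term, uninstantiated) :- var(Term).
describe(Term, instantiated)   :- nonvar(Term).
```

With this, the query describe(X, S) gives S = uninstantiated, while X = 1, describe(X, S) gives S = instantiated. Note that a compound term such as f(Y) also counts as nonvar even though it still contains a variable.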
The standard does note for writing a term,
7.10.5 Writing a term
When a term Term is output using write_term/3 (8.14.2) the action which is taken is defined by the rules below:
a) If Term is a variable, a character sequence representing that variable is output. The sequence begins with _ (underscore) and the remaining characters are implementation dependent. The same character sequence is used for each occurrence of a particular variable in Term. A different character sequence is used for each distinct variable in Term.
Since you specifically mention SWI-Prolog there are other variable caveats to be aware of:
named singleton variables AKA auxiliary variables
Named singletons start with a double underscore (__) or a single underscore followed by an uppercase letter, E.g., __var or _Var.
Attribute variables - provide a technique for extending the Prolog unification algorithm Holzbaur, 1992 by hooking the binding of attributed variables. There is no consensus in the Prolog community on the exact definition and interface to attributed variables. The SWI-Prolog interface is identical to the one realised by Bart Demoen for hProlog Demoen, 2002. This interface is simple and available on all Prolog systems that can run the Leuven CHR system (see chapter 9 and the Leuven CHR page).
Global variables - are associations between names (atoms) and terms.
I don't plan to dive deeper into variables as one has to start looking at SWI-Prolog C level source code to really get a more accurate understanding, (ref). I also don't plan to add more from the standard as one would eventually have to reproduce the entire standard here just to cover all of the references.
For more definitions from the Prolog standard see: Is this Prolog terminology correct? (fact, rule, procedure, predicate, ...) The answer is a community wiki so most users can add to it and the OP does not get the points, so upvote all you want.
Is this an address?
No
Sometimes you will also see logic variable used but I don't plan to expand on that here, however for the record SWI-Prolog is NOT based on WAM it is based on A Portable Prolog Compiler.
See above 7.10.5 Writing a term
Is this an ID of some sort?
I would not argue with that in a casual conversation about SWI-Prolog but there are enough problems with that simple analogy to split hairs and start a discussion/debate, e.g. can a blob be assigned to a variable? What is numbervars?
See above 7.10.5 Writing a term
I notice that we'll get a different value each time.
The Prolog standard uses the word occurrence.
See above 7.10.5 Writing a term
why does Prolog report true.?
Prolog is a logic language which executes queries (goal) that result in either true or false or the instantiated values of variables, however there can be side effects such as writing to a file, throwing exceptions, etc.
The Prolog standard states
A.2.1.1 The General Resolution Algorithm
The general resolution of a goal G of a database P is defined by the following non-deterministic algorithm:
a) Start with the initial goal G which is an ordered conjunction of predications.
b) If G is the singleton true then stop (success).
c) Choose a predication A in G (predication-choice)
d) If A is true, delete it, and proceed to step (b).
e) If no renamed clause in P has a head which unifies with A then stop (failure).
f) Choose a freshly renamed clause in P whose head H unifies with A (clause-choice) where σ = MGU(H, A) and B is the body of the clause,
g) Replace in G the predication A by the body B, flatten and apply the substitution σ.
h) Proceed to step (b).
Also see:
Resolution
MGU
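The steps above can be approximated by the classic "vanilla" meta-interpreter. This is a minimal sketch, not how any real Prolog system is implemented: solve/1 and the parent/ancestor example database are assumptions made for illustration, the predicates are declared dynamic so that clause/2 is allowed to inspect them, and only pure conjunctions of user-defined goals are handled.

```prolog
:- dynamic parent/2, ancestor/2.

% Example database (hypothetical).
parent(tom, bob).
parent(bob, ann).

ancestor(X, Y) :- parent(X, Y).
ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

% solve(+Goal): prove Goal against the database.
solve(true) :- !.                 % steps (b)/(d): nothing left to prove
solve((A, B)) :- !,               % step (c): pick the leftmost predication
    solve(A),
    solve(B).
solve(Goal) :-
    clause(Goal, Body),           % steps (e)/(f): clause/2 yields a freshly
                                  % renamed clause whose head unifies with Goal
    solve(Body).                  % step (g): continue with the body
```

For example, the query findall(X, solve(ancestor(tom, X)), Xs) gives Xs = [bob, ann]. Clause choice is done by backtracking over clause/2, and the substitution is applied implicitly through the variable bindings.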
Is this Prolog saying that anything can be unified with X?
For very simple Prolog implementations (ref) then the question would make sense. In the real world and even more so with SWI-Prolog where the rubber meets the road I would have to say not in all cases.
For most Prolog code syntactic unification is what is driving what is happening. See: A.2.1.1 The General Resolution Algorithm above. However if you start to think about things like blobs, attributes, threads, exceptions, and so on then you really have to look at what is a variable, even the kind of variable and what that variable can do, e.g.
?- X is true.
ERROR: Arithmetic: `true/0' is not a function
ERROR: In:
ERROR: [10] _4608 is true
ERROR: [9] toplevel_call(user:user: ...) at c:/program files/swipl/boot/toplevel.pl:1117
?- trie_new(A_trie).
A_trie = <trie>(0000000006E71DB0).
?- write(X).
_13074
true.
Why do these appear in the order they do?
write(X). is the goal entered by the user.
The goal is executed, which in this case has the side effect of writing to the current output stream, e.g. current_output/1, the variable X, which for SWI-Prolog for this occurrence of X is uninstantiated and is displayed as _13074.
The logic query ends and the result of the query being logical is either true or false. Since the query executed successfully the result is true.
If we were to unify X with, say, 1, then the result is different.
?- X = 1, write(X).
1
X = 1.
I will presume you are asking why there is no true at the end.
IIRC with SWI-Prolog, if the query starts with a variable and then the query succeeds with the variable being instantiated that will be reported and no true or false will then appear, E.g.
?- X = 1.
X = 1.
?- current_prolog_flag(double_quotes,V).
V = string.
?- X = 1, Y = 1, X = Y.
X = Y, Y = 1.
If however the query succeeds and no variable was instantiated then the query will report true E.g.
?- 1 = 1.
true.
?- current_prolog_flag(double_quotes,string).
true.
If the query fails, the query will report false, e.g.
?- X = 1, Y = 2, X = Y.
false.
?- current_prolog_flag(quotes,String).
false.
I suspect this much information will now have you asking for more details but I won't go much deeper than this as SO is not a place to give a lecture condensed into an answer. I will try to clarify what is written but if it needs a lot more detail expect to be requested to post a new separate question.
I know the info from the standard presented here leaves lots of loose ends. If you really want the details from the standard then purchase the standard, as those of us who have it have done. I know it is not cheap but for questions like this it is the source of the answers.

Why should a rule be standardized in Backward Chaining before looking for substitutions?

I understood most of the Backward Chaining algorithm (for first-order logic), but not what Standardize-Variables(rule) is for. Below is the pseudo-code of the algorithm:
function FOL-BC-Ask(KB, query) returns a generator of substitutions
    return FOL-BC-Or(KB, query, {})
function FOL-BC-Or(KB, goal, θ) returns a substitution
    for each rule in Fetch-Rules-For-Goal(KB, goal) do
        (lhs ⇒ rhs) ← Standardize-Variables(rule)
        for each θ' in FOL-BC-And(KB, lhs, Unify(rhs, goal, θ)) do
            yield θ'
function FOL-BC-And(KB, goals, θ) returns a substitution
    if θ = failure then return
    else if Length(goals) = 0 then yield θ
    else
        first, rest ← First(goals), Rest(goals)
        for each θ' in FOL-BC-Or(KB, Subst(θ, first), θ) do
            for each θ'' in FOL-BC-And(KB, rest, θ') do
                yield θ''
I'm studying on the book Artificial Intelligence - A Modern Approach and the code comes from there. The book simply says
FOL-BC-Or works by fetching all clauses that might unify with the goal, standardizing the variables in the clause to be brand-new variables, and then ...
I do understand this, but I do not understand why it needs to be done, or what would happen without it.
I hope someone can explain this. Thank you.
The reason for standardizing variables apart is rather mundane: scope. A variable is "local" to its clause, so when it appears in multiple clauses, it really should be treated as a different variable in each clause. Standardizing apart makes sure this is made clear by using different names in each clause.
Let me explain in more detail. In a normalized first-order logic theory, each clause is implicitly universally quantified. If I have a theory with two clauses
happy(X)
happy(X) or not friends(X,Y),
it means the same as
for all X: happy(X)
for all X : for all Y : happy(X) or not friends(X,Y)
You can think of "for all X" as a sort of "declaration" of X (in the programming sense of "declaration"), so each of these variables is, so to speak, "local" to the clause, in the same sense that a local variable in programming is local to its scope. It is pure coincidence that X is used in both clauses, and in fact we can rename them at will within each clause and obtain perfectly equivalent theories such as
for all U: happy(U)
for all V : for all W : happy(V) or not friends(V,W)
or even
for all X: happy(X)
for all Y : for all X : happy(Y) or not friends(Y,X)
Standardizing apart comes into play because if we try to unify these two clauses, there will be two variables with the same name X even though they do not necessarily refer to the same entities. If we try to unify the two clauses above without standardizing apart first, we will unify X and Y and end up with
happy(X) or not friends(X,X)
which implies that both arguments of "friends" are the same even though that would not be implied if we simply renamed the variables. Unifying the same perfectly equivalent two clauses using U, V, W names results in
happy(U) or not friends(U, W)
where now the two arguments of "friends" are not required to be the same.
The fact that we obtained different results from unifying perfectly equivalent theories shows us that something must be incorrect. And indeed what is incorrect here is unifying two clauses that use a variable with the same name (X) even though they are not really the same variable and could be equivalently renamed to something else.
David Einsentat's comment is correct that failing to standardize apart is incorrect as it does not provide the most general unifier, because it may provide a unifier that has spurious constraints such as the equality we saw above, preventing it from being as general as it should.
Standardizing apart solves this problem by renaming the variables to "brand-new ones", meaning variables that do not appear anywhere else and which therefore do not pose the risk of colliding in this way and introducing a false equality based on purely arbitrary name choices.
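The effect can be reproduced in Prolog itself, where copy_term/2 plays the role of Standardize-Variables by replacing the variables of a term with brand-new ones. This is only a sketch: the knows/2 terms and the two predicate names are invented for the illustration, and the goal and the clause head deliberately share the variable X to mimic a name collision.

```prolog
% The goal and the clause head both use X, but only by coincidence.
% Without renaming, X must be john and elizabeth at once: this fails.
unify_without_renaming(X) :-
    Goal = knows(john, X),
    Head = knows(X, elizabeth),
    Goal = Head.

% With renaming, the head gets a fresh variable in place of X,
% so the unification succeeds and binds X to elizabeth.
unify_with_renaming(X) :-
    Goal = knows(john, X),
    Head = knows(X, elizabeth),
    copy_term(Head, FreshHead),   % FreshHead = knows(_New, elizabeth)
    Goal = FreshHead.
```

The query unify_with_renaming(X) gives X = elizabeth, whereas unify_without_renaming(X) fails purely because of the accidental name clash.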

Free Variable in Prolog

Can anyone explain the concept of free variables in Prolog. Is it similar to anonymous variables ? Or is there a difference. Also could be great if an example is given to explain.
tl;dr:
free is a notion to distinguish universally bound (free in clause notation) from existentially bound variables in setof/3, bagof/3, etc. - some people use free to mean "currently not instantiated" and some use it to denote an output argument that's meant to be instantiated by the predicate but that's not how the standard uses it.
long version:
I will quote the Prolog standard on the definition:
7.1.1.4 Free variables set of a term
The free variables set, FVt of a term T with respect to a
term v is a set of variables defined as the set difference
of the variable set (7.1.1.1) of T and BV where BV is a
set of variables defined as the union of the variable set of
v and the existential variables set (7.1.1.3) of T.
where they explicitly note:
The concept of a free variables set is required when defining
bagof/3 (8.10.2) and setof/3 (8.10.3).
Perhaps as a background: in logic, a free variable is one that is not bound by a quantifier (e.g. x is bound and y is free in ∀x p(x,y) ). A (pure) prolog clause head(X) :- goal1(X), goal2(X). can be read as the logical formula ∀X goal1(X) ∧ goal2(X) → head(X). In practice, as long as we use fresh variables whenever we try to unify a goal with a clause, we can just disregard the universal quantifiers. So for our purposes we can treat X in the clause above as free.
This is all and well until meta-predicates come in: say we are interested in the set of first elements in a list of tuples:
?- setof(X, member(X-Y, [1-2, 2-2, 1-3]), Xs).
Y = 2,
Xs = [1, 2] ;
Y = 3,
Xs = [1].
But we get two solutions: the ones where Y=2 and those where Y=3. What I'd actually want to say is: there exists some Y such that X-Y is a member of the list. The Prolog notation for this pattern is to write Var^Term:
?- setof(X, Y^member(X-Y, [1-2, 2-2, 1-3]), Xs).
Xs = [1, 2].
In the first example, both X and Y are free, in the second example X is free and Y is bound.
If we write this as a formula we get setof(X, ∃Y member(X-Y, [1-2, 2-2, 1-3]), Xs) which is not a first order formula anymore (there is an equivalent first order one but this is where the name meta predicate comes in). Now the problem is that the Var^Term notation is purely syntactical - internally there is only one type of variable. But when we describe the behaviour of setof and friends we need to distinguish between free and existentially bound variables. So unless you are using metapredicates, all of your variables can be considered as free (1).
The Learning Prolog link provided by #Reema Q Khan is a bit fuzzy in its use of free. Just looking at the syntax, X is free in X=5, X is 2 + 3. But when we run this query, as soon as we get to the second goal, X has been instantiated to 5 so we are actually running the query 5 is 2 + 3 (2). What is meant in that context is that we expect is/2 to unify its first argument (often called "output" argument). To make sure this always succeeds we would pass a variable here (even though it's perfectly fine not to do it). The text tries to describe this expectation as "free variable" (3).
(1) ok, formally, anything that looks like Var^Term considers Var existentially bound but without meta-predicates this doesn't matter.
(2) I believe there is a clash in notation that some texts use "X is bound to 5" here, which might increase the confusion.
(3) What they should say is that they expect that the argument has not been instantiated yet but even that does not capture the semantics correctly - Paulo Moura already gave the initial ground example 5 is 3 + 2.
Maybe this can help. (If I have prepared it, I might as well post it! Still hard to read, needs simplification.)
In fact, you need to distinguish whether you talk about the syntax of the program or whether you talk about the runtime state of the program.
The word "variable" takes on slightly different meanings in both cases. In common usage, one does not make a distinction, and the understanding this fluent usage provides is good enough. But for beginners, this may be a hurdle.
In logic, the word "variable" has the meaning of "a symbol selected from the set of variable symbols", and it stands for the possibly infinite set of terms it may take on while fulfilling any constraints given by the logical formulae it participates in. This is not the "variable" used in reasoning about actual programs.
Free Variable:
"is" is a built-in arithmetic evaluator in Prolog. "X is E" requires X to be a free variable and E to be an arithmetic expression that can be evaluated. E can contain variables, but these variables have to be bound to numbers, e.g., "X=5, Y is 2*X" is a correct Prolog goal.
More Explanation:
http://kti.ms.mff.cuni.cz/~bartak/prolog.old/learning/LearningProlog11.html
Anonymous Variable:
The name of every anonymous variable is _ .
More Explanation:
https://dobrev.com/help/tut/The_anonymous_variable.html#:~:text=The%20anonymous%20variable%20is%20an,of%20_denotes%20a%20distinct%20variable%20.

What does +,+ mode in Prolog mean?

So am being told a specific predicate has to work in +,+ mode. What does that mean in Prolog?
When one wants to give information on a predicate in prolog, those conventions are often used :
arity : predicate/3 means predicate takes 3 arguments.
parameters : predicate(+Element, +List, -Result) means that Element and List should not be free variables and that Result should be a free variable for the predicate to work properly. ? is used when it can be both, # is mentioned in the above answer but is not really used as much (at least in the swi-pl doc) and means that the input will not be bound during the call.
so telling that somepredicate works in +, + mode is a shortcut for telling that :
% somepredicate/2 : somepredicate(+Input1, +Input2)
In order to give you a definite answer you need to tell us more than just +,+. For predicates whose arguments are only atoms, things are well defined: p(+,+) means that the predicate should only be called with both arguments being atoms.
But if we have, say lists, things are more complex. There are two meanings in that case. Consider member/2 which succeeds for member(2,[1,2,3]).
Are the queries member(2,[X]) or member(2,[X|Xs]) now +,+ or not?
The direct interpretation which is also used in ISO Prolog says that (quoting 8.1.2.2 Mode of an argument, from ISO/IEC 13211-1:1995):
+ the argument shall be instantiated,
In that sense both queries above are +,+.
However, there is another interpretation which implicitly assumes that we have access to the definition of the predicate. This interpretation stems from the mode declarations of DEC-10 Prolog, one of the first Prolog systems. So let's look at member/2:
member(X, [X|_]).
member(X, [_|Xs]) :-
    member(X, Xs).
A mode member(+,+) would now mean that when executing a goal, this mode will hold for all subgoals. That is, member(2,[X]) would be +,+ whereas member(2,[X|Xs]) is not because of its subgoal member(2,Xs).
People do confuse these notions quite frequently. So when you are talking about lists or other compound terms, it helps to ask what is meant.
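The two readings can be told apart with a couple of run-time checks. This is a hedged sketch rather than an official mode checker: instantiated_call/2 and closed_list/1 are names invented here, and closed_list/1 only approximates the DEC-10 reading for predicates that, like member/2, recurse on the list tail.

```prolog
% ISO reading of +,+: neither argument is an unbound variable
% at the time of the call.
instantiated_call(A, B) :-
    nonvar(A),
    nonvar(B).

% Stricter reading for a list argument: every tail is instantiated
% too, so the mode also holds for each recursive subgoal.
closed_list(List) :-
    nonvar(List),
    (   List == []
    ->  true
    ;   List = [_|Tail],
        closed_list(Tail)
    ).
```

Both instantiated_call(2, [X]) and instantiated_call(2, [X|Xs]) succeed, but closed_list([X]) succeeds while closed_list([X|Xs]) fails because the tail Xs is unbound.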
For more on modes see this answer.
It means that the arguments to the predicate will both be input arguments (though not pure input).
This page has a succinct description of all of Prolog's call modes.

prolog cut off in method

I have a question I would like to ask you something about a code snippet:
insert_pq(State, [], [State]) :- !.
insert_pq(State, [H|Tail], [State, H|Tail]) :-
    precedes(State, H).
insert_pq(State, [H|T], [H|Tnew]) :-
    insert_pq(State, T, Tnew).
precedes(X, Y) :- X < Y. % < needs to be defined depending on problem
The function quite clearly adds an item to a priority queue. The problem I have is the cut off operator in the first line. Presumably whenever the call reaches this line of code this is the only possible solution to the query, and the function calls would simply unwind (or is it wind up?), there would be no need to back track and search for another solution to the query.
So this cut off here is superfluous. Am I correct in my deduction?
Yes, any half-decent Prolog compiler will notice that there is no other clause where the second argument is an empty list.
It would be more useful at the end of the second clause, though I'd rather combine the second and the third clause and use a local cut (precedes(...) -> ... ; ...).
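A sketch of that suggestion, merging the second and third clauses with if-then-else, could look like the following. It keeps the asker's precedes/2 and assumes numeric states so that < applies.

```prolog
% insert_pq(+State, +Queue, -NewQueue): insert State into the
% sorted Queue, giving NewQueue.
insert_pq(State, [], [State]).
insert_pq(State, [H|T], Result) :-
    (   precedes(State, H)
    ->  Result = [State, H|T]          % State goes in front of H
    ;   Result = [H|Tnew],             % otherwise keep H and recurse
        insert_pq(State, T, Tnew)
    ).

precedes(X, Y) :- X < Y.               % assumes numeric states
```

For example, insert_pq(3, [1, 2, 4, 5], Q) gives Q = [1, 2, 3, 4, 5] as the only solution. The local cut hidden in -> stops the recursive branch from being tried on backtracking once precedes/2 has succeeded, which the original three-clause version allowed.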
The particular technique that the compiler uses to eliminate candidate clauses for matching is called argument indexing. Different Prolog implementations could potentially index different numbers of arguments by default.
So if you're worried about whether an argument is being indexed or not, you should check how many arguments the prolog you're using indexes. According to the SWI reference manual it only indexes the first argument by default. So in your case the cut is actually not redundant. You can however explicitly stipulate which arguments should be indexed using the predicates index/1 and hash/1 which are linked to in the above link.
Or you could just reorder the arguments, or you could just keep the cut.
Yes, you are correct. Even if the compiler isn't half-decent (which SWI Prolog certainly is), the worst it can do is match the second and third clauses, which will fail immediately.
However, if the second clause matches, the third does as well. Is this the intended behaviour?
