What is Prolog saying about an uninstantiated variable?

What is Prolog saying about an uninstantiated variable? - prolog

Say we were to execute the following, and SWI Prolog responds:
?- write(X).
_13074
true.
What is _13074? Is this an address? Is this an ID of some sort? I notice that we'll get a different value each time. Furthermore, why does Prolog report true.? Is this Prolog saying that anything can be unified with X? Why do these appear in the order they do?
If we were to unify X with, say, 1, then the result is different.
?- X = 1, write(X).
1
X = 1.

What is _13074?
The simplest answer is that it represents an uninstantiated variable.
To be more precise from the Prolog Standard
anonymous variable: A variable (represented in a term or Prolog text by _) which differs from every other variable (and anonymous
variable) (see 6.1.2, 6.4.3)
instantiated: A variable is instantiated with respect to substitution if application of the substitution yields an atomic term or a compound term.
A term is instantiated if any of its variables are instantiated.
uninstantiated: A variable is uninstantiated when it is not instantiated.
variable: An object which may be instantiated to a term during execution.
named variable: A variable which is not an anonymous variable (see 6.1.2, 6.4.3)
So obviously all of that is self referential but in short by the standard there are anonymous variables _ and named variables, E.g. X, Y, Ls.
Note that the standard does not say what is the difference between _ and variables with numbers in the suffix, E.g. _13074. Those are implementation specific.
The standard does note for writing a term,
7.10.5 Writing a term
When a term Term is output using write-term/3 (8.14.2) the action which is taken is defined by the rules below:
a) If Term is a variable, a character sequence representing that variable is output. The sequence begins with _ (underscore) and the remaining characters are implementation dependent. The same character sequence is used for each occurrence of a particular variable in Term. A different character sequence is used for each distinct variable in Term.
Since you specifically mention SWI-Prolog there are other variable caveats to be aware of:
named singleton variables AKA auxiliary variables
Named singletons start with a double underscore (__) or a single underscore followed by an uppercase letter, E.g., __var or _Var.
Attribute variables - provide a technique for extending the Prolog unification algorithm Holzbaur, 1992 by hooking the binding of attributed variables. There is no consensus in the Prolog community on the exact definition and interface to attributed variables. The SWI-Prolog interface is identical to the one realised by Bart Demoen for hProlog Demoen, 2002. This interface is simple and available on all Prolog systems that can run the Leuven CHR system (see chapter 9 and the Leuven CHR page).
Global variables - are associations between names (atoms) and terms.
I don't plan to dive deeper into variables as one has to start looking at SWI-Prolog C level source code to really get a more accurate understanding, (ref). I also don't plan to add more from the standard as one would eventually have to reproduce the entire standard here just to cover all of the references.
For more definitions from the Prolog standard see: Is this Prolog terminology correct? (fact, rule, procedure, predicate, ...) The answer is a community wiki so most users can add to it and the OP does not get the points, so upvote all you want.
Is this an address?
No
Sometimes you will also see logic variable used but I don't plan to expand on that here, however for the record SWI-Prolog is NOT based on WAM it is based on A Portable Prolog Compiler.
See above 7.10.5 Writing a term
Is this an ID of some sort?
I would not argue with that in a causal conversation about SWI-Prolog but there is enough problems with that simple analogy to split hairs and start a discussion/debate, E.g. can a blob be assigned to a variable? What is numbervars?
See above 7.10.5 Writing a term
I notice that we'll get a different value each time.
The Prolog standard uses the word occurrence.
See above 7.10.5 Writing a term
why does Prolog report true.?
Prolog is a logic language which executes queries (goal) that result in either true or false or the instantiated values of variables, however there can be side effects such as writing to a file, throwing exceptions, etc.
The Prolog standard states
A.2.1.1 The General Resolution Algorithm
The general resolution of a goal G of a database P is defined by the following non-deterministic algorithm:
a) Start with the initial goal G which is an ordered conjunction of
predications.
b) If G is the singleton true then stop (success).
c) Choose a predication A in G (predication-choice)
d) If A is true, delete it, and proceed to step (b).
e) If no renamed clause in P has a head which unifies with A then stop (failure).
f) Choose a freshly renamed clause in P whose head H unifies with A (clause-choice) where σ = MGU(H, A) and B is the body of the clause,
g) Replace in G the predication A by the body B, flatten and apply the substitution σ.
h) Proceed to step (b).
Also see:
Resolution
MGU
Is this Prolog saying that anything can be unified with X?
For very simple Prolog implementations (ref) then the question would make sense. In the real world and even more so with SWI-Prolog were the rubber meets the road I would have to say not in all cases.
For most Prolog code syntactic unification is what is driving what is happening. See: A.2.1.1 The General Resolution Algorithm above. However if you start to think about things like blobs, attributes, threads, exceptions, and so on then you really have to look at what is a variable, even the kind of variable and what that variable can do , E.g.
?- X is true.
ERROR: Arithmetic: `true/0' is not a function
ERROR: In:
ERROR: [10] _4608 is true
ERROR: [9] toplevel_call(user:user: ...) at c:/program files/swipl/boot/toplevel.pl:1117
?- trie_new(A_trie).
A_trie = <trie>(0000000006E71DB0).
?- write(X).
_13074
true.
Why do these appear in the order they do?
write(X). is the goal entered by the user.
The goal is executed which in this case has the side effect of writing to the current output stream stream, E.g. current_output/1, the variable X which for SWI-Prolog for this occurrence of X is uninstantiated and is displayed as _13074.
The logic query ends and the result of the query being logical is either true or false. Since the query executed successfully the result is true.
If we were to unify X with, say, 1, then the result is different.
?- X = 1, write(X).
1
X = 1.
I will presume you are asking why there is no true at the end.
IIRC with SWI-Prolog, if the query starts with a variable and then the query succeeds with the variable being instantiated that will be reported and no true or false will then appear, E.g.
?- X = 1.
X = 1.
?- current_prolog_flag(double_quotes,V).
V = string.
?- X = 1, Y = 1, X = Y.
X = Y, Y = 1.
If however the query succeeds and no variable was instantiated then the query will report true E.g.
?- 1 = 1.
true.
?- current_prolog_flag(double_quotes,string).
true.
If the the query fails the query will report false E.g.
?- X = 1, Y = 2, X = Y.
false.
?- current_prolog_flag(quotes,String).
false.
I suspect this much information will now have you asking for more details but I won't go much deeper than this as SO is not a place to give a lecture condensed into an answer. I will try to clarify what is written but if it needs a lot more detail expect to be requested to post a new separate question.
I know the info from the standard presented here leaves lots of lose ends. If you really want the details from the standard then purchase the standard as those of us who have have done. I know it is not cheap but for questions like this it is the source of the answers.

Related

Why should a rule be standardized in Backward Chaining before looking for substitutions?

I understood most of the Backward Chaining algorithm (for first-order logic), but not what Standardize-Variables(rule) is for. Below is the pseudo-code of the algorithm:
function FOL-BC-Ask(KB, query) returns a generator of substitutions
return FOL-BC-Or(KB, query, {})
function FOL-BC-Or(KB, goal, θ) returns a substitution
for each rule in Fetch-Rules-For-Goal(KB, goal) do
(lhs ⇒ rhs) ← Standardize-Variables(rule)
for each θ' in FOL-BC-And(KB, lhs, Unify(rhs, goal, θ)) do
yield θ'
function FOL-BC-And(KB, goals, θ) returns a substitution
if θ = failure then return
else if Length(goals) = 0 then yield θ
else
first, rest ← First(goals), Rest(goals)
for each θ' in FOL-BC-Or(KB, Subst(θ, first), θ) do
for each θ'' in FOL-BC-And(KB, rest, θ') do
yield θ''
I'm studying on the book Artificial Intelligence - A Modern Approach and the code comes from there. The book simply says
FOL-BC-Or works by fetching all clauses that might unify with the goal, standardizing the variables in the clause to be brand-new variables, and then ...
I do understand this, but I do not understand why it needs to be done, or what would happen without it.
I hope someone can explain this. Thank you.

The reason for standardizing variables apart is rather mundane: scope. A variable is "local" to its clause, so when it appears in multiple clauses, it really should be treated as a different variable in each clause. Standardizing apart makes sure this is made clear by using different names in each clause.
Let me explain in more detail. In a normalized first-order logic theory, each clause is implicitly universally quantified. If I have a theory with two clauses
happy(X)
happy(X) or not friends(X,Y),
it means the same as
for all X: happy(X)
for all X : for all Y : happy(X) or not friends(X,Y)
You can think of "for all X" as a sort of "declaration" of X (in the programming sense of "declaration"), so each of these variables is, so to speak, "local" to the clause, in the same sense that a local variable in programming is local to its scope. It is pure coincidence that X is used in both clauses, and in fact we can rename them at will within each clause and obtain perfectly equivalent theories such as
for all U: happy(U)
for all V : for all W : happy(V) or not friends(V,W)
or even
for all X: happy(X)
for all Y : for all X : happy(Y) or not friends(Y,X)
Standardizing apart comes into play because if we try to unify these two clauses, there will be two variables with the the same name X even though they do not necessarily refer to the same entities. If we try to unify the two clauses above without standardizing apart first, we will unify X and Y and end up with
happy(X) or not friends(X,X)
which implies that both arguments of "friends" are the same even though that would not be implied if we simply renamed the variables. Unifying the same perfectly equivalent two clauses using U, V, W names results in
happy(U) or not friends(U, W)
where now the two arguments of "friends" are not required to be the same.
The fact that we obtained different results from unifying perfectly equivalent theories shows us that something must be incorrect. And indeed what is incorrect here is unifying two clauses that use a variable with the same name (X) even though they are not really the same variable and could be equivalently renamed to something else.
David Einsentat's comment is correct that failing to standardize apart is incorrect as it does not provide the most general unifier, because it may provide an unifier that has spurious constraints such as the equality we saw above, preventing it from being as general as it should.
Standardizing apart solves this problem by renaming the variables to "brand-new ones", meaning variables that do no appear anywhere else and which therefore do not pose the risk of colliding in this way and introducing a false equality based on purely arbitrary name choices.

Free Variable in Prolog

Can anyone explain the concept of free variables in Prolog. Is it similar to anonymous variables ? Or is there a difference. Also could be great if an example is given to explain.

tl;dr:
free is a notion to distinguish universally bound (free in clause notation) from existentially bound variables in setof/3, bagof/3, etc. - some people use free to mean "currently not instantiated" and some use it to denote an output argument that's meant to be instantiated by the predicate but that's not how the standard uses it.
long version:
I will quote the Prolog standard on the definition:
7.1.1.4 Free variables set of a term
The free variables set, FVt of a term T with respect to a
term v is a set of variables defined as the set difference
of the variable set (7.1.1.1) of T and BV where BV is a
set of variables defined as the union of the variable set of
v and the existential variables set (7.1.1.3) of T.
where they explicitly note:
The concept of a free variables set is required when defining
bagof/3 (8.10.2) and setof/3 (8.10.3).
Perhaps as a background: in logic, a free variable is one that is not bound by a quantifier (e.g. x is bound and y is free in ∀x p(x,y) ). A (pure) prolog clause head(X) :- goal1(X), goal2(X). can be read as the logical formula ∀X goal1(X) ∧ goal2(X) → head(X). In practice, as long as we use fresh variables whenever we try to unify a goal with a clause, we can just disregard the universal quantifiers. So for our purposes we can treat X in the clause above as free.
This is all and well until meta-predicates come in: say we are interested in the set of first elements in a list of tuples:
?- setof(X, member(X-Y, [1-2, 2-2, 1-3]), Xs).
Y = 2,
Xs = [1, 2] ;
Y = 3,
Xs = [1].
But we get two solutions: the ones where Y=2 and those where Y=3. What I'd actually want to say is: there exists some Y such that X-Y is a member of the list. The Prolog notation for this pattern is to write Var^Term:
?- setof(X, Y^member(X-Y, [1-2, 2-2, 1-3]), Xs).
Xs = [1, 2].
In the first example, both X and Y are free, in the second example X is free and Y is bound.
If we write this as a formula we get setof(X, ∃Y member(X-Y, [1-2, 2-3, 1-3]), Xs) which is not a first order formula anymore (there is an equivalent first order one but this is where the name meta predicate comes in). Now the problem is that the Var^Term notation is purely syntactical - internally there is only one type of variable. But when we describe the behaviour of setof and friends we need to distinguish between free and existentially bound variables. So unless you are using metapredicates, all of your variables can be considered as free (1).
The Learning Prolog link provided by #Reema Q Khan is a bit fuzzy in its use of free. Just looking at the syntax, X is free in X=5, X is 2 + 3. But when we run this query, as soon as we get to the second goal, X has been instantiated to 5 so we are actually running the query 5 is 2 + 3 (2). What is meant in that context is that we expect is/3 to unify its first argument (often called "output" argument). To make sure this always succeeds we would pass a variable here (even though it's perfectly fine not to do it). The text tries to describe this expectation as "free variable" (3).
(1) ok, formally, anything that looks like Var^Term considers Var existentially bound but without meta-predicates this doesn't matter.
(2) I believe there is a clash in notation that some texts use "X is bound to 5" here, which might increase the confusion.
(3) What the should say is that they expect that the argument has not been instantiated yet but even that does not capture the semantics correctly - Paulo Moura already gave the initial ground example 5 is 3 + 2.

Maybe this can help. (If I have prepared it, I might as well post it! Still hard to read, needs simplification.)
In fact, you need to distinguish whether you talk about the syntax of the program or whether you talk about the runtime state of the program.
The word "variable" takes on slightly different meanings in both cases. In common usage, one does not make a distinction, and the understanding this fluent usage provides is good enough. But for beginners, this may be a hurdle.
In logic, the word "variable" has the meaning of "a symbol selected from the set of variable symbols", and it stands for the possibly infinite set of terms it may take on while fulfilling any constraints given by the logical formulae it participates in. This is not the "variable" used in reasoning about an actual programs.

Free Variable:
"is" is a build-in arithmetic evaluator in Prolog. "X is E" requires X to be free variable and E to be arithmetic expression that is possible to evaluate. E can contain variables but these variables has to be bound to numbers, e.g., "X=5, Y is 2*X" is correct Prolog goal.
More Explanation:
http://kti.ms.mff.cuni.cz/~bartak/prolog.old/learning/LearningProlog11.html
Anonymous Variable:
The name of every anonymous variable is _ .
More Explanation:
https://dobrev.com/help/tut/The_anonymous_variable.html#:~:text=The%20anonymous%20variable%20is%20an,of%20_denotes%20a%20distinct%20variable%20.

How are anonymous variables interpreted in Prolog?

A quick and simple question regarding what role anonymous variables play in the resolution of a Prolog query given a set of program rule. So, the way I understand how the simplest form of SLD resolution works, an SLD tree is constructed by taking some term from a set of goal terms (based on a selection rule, e.g. FIRST) and going through all the program rules to see which rule's left hand side (the consequent, so to say) can be unified with the term at hand. The way to unify two given terms is to take a difference set of two terms and see if variables can be substituted for terms such that the difference vanishes, you do this by successively taking the leftmost single difference and checking if, out of the two sets constituting the difference, one is a variable not appearing in the other and composing your current substitution with one mapping the variable onto the term (starting with the empty or identity substitution).
Now, when anonymous variables (_) come into play, I suspect the trick in doing it correctly and efficiently lies in changing the way you determine the leftmost difference between two terms to ignore a pair of terms whenever one of them is an anonymous variable. The obviously correct way to do it would be to rename every instance of _ in the goal and the program set to a new variable name and solve using those.
How is it actually done? Is my idea sufficient, or is there more to it than that? (Also, would appreciate it very much if something is missing in the way I understand SLD resolution works, barring negation, call, capsuling, arithmetic predicates and the more complicated stuff.)

Prolog anonymous variables don't play a role in SLD resolution or in term unification but do play a practical role in Prolog code and Prolog queries. A fundamental aspect of anonymous variables is that each occurrence of an anonymous variable is a different variable. Consider the following query:
| ?- a(_, _) = a(1, 2).
yes
The unification would have failed if the two anonymous variables were the same variable. Now consider the query:
| ?- a(X, _) = a(1, 2).
X = 1
yes
Variable bindings are only reported for variables that are not anonymous variables. This allows using an anonymous variable everytime we are not interested in any bindings for a variable.
Anonymous variables also simplify writing predicate definitions where they similarly act as "don't care" variables. Consider as an example the usual definition of the member/2 predicate:
member(Element, [Element| _]).
member(Element, [_| List]) :-
member(Element, List).
In the first clause, we don't care about the list tail. In the second clause, we don't care about the list head. By using anonymous variables, we can ignore those sub-terms and avoid the compiler complaining about variables that would be used once in a clause.
Update
Note that all different variables in a query get unique internal variable references, not to be confused with variable names as typed by the user. The variables names are only used by the top-level interpreter to report bindings for successful queries. The inference mechanism used to prove a query use the variable (internal) references. The following query, using the ISO Prolog standard read_term/2 predicate with standard options may help:
| ?- read_term(Term, [variable_names(Names), variables(Variables)]).
a(X, _, Y, _).
Names = ['X'=A,'Y'=B]
Term = a(A,C,B,D)
Variables = [A,C,B,D]
yes
In the term read, there are four distinct variables but only two of them have (user provided) names.

This is a comment in an answer because a comment can not format this as needed.
Using SWI-Prolog
?- trace,(_=_).
Call: (11) _1834=_1836 ? creep
Exit: (11) _1834=_1834 ? creep
true.
Each anonymous variable is created as a separate variable. When the unification takes place the one variable is unified with the other variable.

Prolog single-variable query returns an error. Why?

I've created a simple Prolog program (using GNU Prolog v1.4.4) with a single fact:
sunny.
When I run the following query:
sunny.
I get:
yes
As I'd expect. When I run this query:
X.
I get:
uncaught exception: error(instantiation_error,top_level/0)
when I expected to get:
X = sunny
Anyone know why?

Prolog is based on first-order logic but X is a second order logic query (the variable stands for a rule head / fact, not only a term): you ask "which predicates can be derived?" or in other words "which formulas are true?". Second order logic is so expressive that we lose many nice properties of first-order logic (*). That's why a second order variable must be sufficiently instantiated to know which rule to try at the time it is called (that's what the error message means). For instance the queries
?- X=member(A,[1,2,3]), X.
and
?- member(A,[1,2,3]).
still allow Prolog to try the definition of the member predicate (in fact the two definitions are equivalent) but
?- X, X=member(A,[1,2,3]).
will throw an exception because at the time X should be derived, we don't know that it's supposed to become the predicate member(A,[1,2,3]).
Your case is much simpler though: you can wrap sunny as a term into a predicate such that Prolog knows which rules to try. The facts
weather(sunny).
weather(rainy).
define the predicate weather such that now we only have a first-order variable as argument in our query:
?- weather(X).
X = sunny ;
X = rainy.
Now that we are talking about the term level, everything works as you expected.
(*) Although the problem of finding out if a formula is valid is undecidable in both cases, in first order logic at least all true formulas can be eventually derived but if a formula is false, the search might not terminate (i.e. first-order logic is semi-decidable). For second order logic there are formulas that can neither be proved not disproved. What is worse is that we cannot even tell if a second-order formula belongs to this category.

Determining successor in prolog using recursion

I'm trying (failing) to understand an exercise where I'm given the following clauses;
pterm(null).
pterm(f0(X)) :- pterm(X).
pterm(f1(X)) :- pterm(X).
They represent a number in binary, eg. f0(null) is equivalent to 0, f1(null) is equivalent to 1, etc.
The objective is to define a predicate over pterm such that one is the successor of the other when true. It seems like a relatively simple exercise but I'm struggling to get my head around it.
Here is the code I've written so far;
incr(X,Y) :- pterm(f0(X)), pterm(f1(Y)).
incr(X,Y) :- pterm(f0(f1(X))), pterm(f1(f1(Y))).
Having tested this I know it's very much incorrect. How might I go about inspecting the top level arguments of each pterm?
I've made minimal progress in the last 4 hours so any hints/help would be appreciated.

1)
I'll start with the "how to inspect" question, as I think it will be the most useful. If you're using swi-prolog with xpce, run the guitracer:
?- consult('pterm'). % my input file
% pterm compiled 0.00 sec, 5 clauses
true.
?- guitracer.
% The graphical front-end will be used for subsequent tracing
true.
?- trace. % debugs step by step
true.
[trace] ?- pterm(f0(f1(null))). % an example query to trace
true.
A graphical interface will come up. Press the down arrow to unify things step by step. What's going on should make sense fairly quickly.
(use notrace. and nodebug. appropriately to exit trace and debug modes afterwards).
2) You seem to misunderstand how predicates work. A predicate is a logical statement, i.e. it will always return either true or false. You can think of them as classical boolean functions of the type "iseven(X)" (testing if X is even) or "ismemberof(A,B)" (testing if A is a member of B) etc. When you have a rule like "pred1 :- pred2, pred3." this is similar to saying "pred1 will return true if pred2 returns true, and pred3 returns true (otherwise pred1 returns false)".
When your predicates are called using constants, checking its truth value is a matter of checking your facts database to see if that predicate with those constants can be satisfied. But when you call using variables, prolog goes through a wild goose chase, trying to unify that variable with all the allowable stuff it can link it to, to see if it can try to make that predicate true. If it can't, it gives up and says it's false.
A predicate like incr(X,Y) is still something that needs to return true or false, but, if by design, this only becomes true when Y is the incremented version of X, where X is expected to be given at query time as input, then we have tricked prolog into making a "function" that is given X as input, and "returns" Y as output, because prolog will try to find an appropriate Y that makes the predicate true.
Therefore, with your example, incr(X,Y) :- pterm(f0(X)), pterm(f1(Y)). makes no sense, because you're telling it that incr(X,Y) will return true for any X,Y, as long as prolog can use X to find in the fact database any pterm(f0(X)) that will lead to a known fact, and also use Y to find a pterm(f1(Y)) term. You haven't made Y dependent on X in any way. This query will succeed for X = null, and Y = null, for instance.
Your first clause should be something like this.
incr(X,Y) :- X = pterm(f0(Z)), Y = pterm(f1(Z)).
where = performs unification. I.e. "find a value for Z such that X is pterm(f0(Z)), and for the same value of Z it also applies that Y = pterm(f1(Z))."
In fact, this could be more concisely rewritten as a fact:
incr( pterm(f0(Z)), pterm(f1(Z)) ).
3)
Your second clause can be adapted similarly. However, I'm not sure if this is correct in terms of the logic of what you're trying to achieve (i.e. binary arithmetic). But I may have misunderstood the problem you're trying to solve.
My assumption is that if you have (0)111, then the successor should be 1000, not 1111. For this, I would guess you need to create a predicate that recursively checks if the incrementation of the digits below the currently processed one results in a 'carried' digit.
(since the actual logic is what your assignment is about, I won't offer a solution here. but hope this helps get you into grips with what's going on. feel free to have a go at the recursive version and ask another question based on that code!)

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio