So I have the following working code in Prolog that produces the factorial of a given value of A:
factorial(0,1).
factorial(A,B) :- A>0, C is A-1, factorial(C,D), B is A*D.
I am looking for an explanation as to how this code works. I.e, what exactly happens when you ask the query: factorial(4, Answer).
Firstly,
factorial(0, 1).
I know the above is the "base case" of the recursive definition. What I am not sure of why/how it is the base case. My guess is that factorial(0, 1) inserts some structure containing (0, 1) as a member of "factorial". If so, what does the structure look like? I know if we say something like "rainy(seattle).", this means that Seattle is rainy. But "factorial(0, 1)"... 0, 1 is factorial? I realize it means factorial of 0 is 1, but how is this being used in the long run? (Writing this is helping me understand more as I go along, but I would like some feedback to make sure my thinking is correct.)
factorial(A,B) :- A>0, C is A-1, factorial(C,D), B is A*D.
Now, what exactly does the above code mean. How should I read it?
I am reading it as: factorial of (A, B) is true if A>0, C is A-1, factorial(C, D), B is A*D. That does not sound quite right to me... Is it?
"A > 0". So if A is equal to 0, what happens? It must not return at this point, or else the base case would never be used. So my guess is that A > 0 returns false, but the other functions are executed one last time. Did recursion stop because it reached the base case, or because A was not greater than 0? Or a combination of both? At what point is the base case used?
I guess that boils down to the question: What is the purpose of having both a base case and A > 0?
Sorry for the badly formed questions, thank you.
EDIT: In fact, I removed "A > 0" from the procedure and the code still works. So I guess my questions were not stupid at least. (And that code was taken from a tutorial.)
It is counterproductive to think of Prolog facts and rules in terms of data structures. When you write factorial(0, 1). you assert a fact to the Prolog interpreter that is assumed to be universally true. With this fact alone Prolog can answer questions of three types:
What is the factorial of 0? (i.e. factorial(0, X); the answer is X=1)
A factorial of what number is 1? (i.e. factorial(X,1); the answer is X=0)
Is it true that a factorial of 0 is 1? (i.e. factorial(0,1); the answer is "Yes")
As far as the rest of your Prolog program is concerned, only the first question is important. That is the question that the second clause of your factorial/2 rule will be asking at the end of evaluating a factorial.
The second rule uses comma operator, which is Prolog's way of saying "and". Your interpretation can be rewritten in terms of variables A and B like this:
B is a factorial of A when A>0, and C is set to A-1, and D is set to the factorial of C, and B is set to A times D
This rule covers all As above zero. The reference to factorial(C,D) will use the same rule again and again, until C arrives to zero. This is when this rule stops being applicable, so Prolog would grab the "base case" rule, and use 1 as its output. At this point, the chain of evaluating factorial(C, D) starts unwrapping, until it goes all the way to the initial invocation of the rule. This is when Prolog computes the final answer, and factorial/2 returns "Yes" and produces the desired output value.
In response to your edit, removing the A>0 is not dangerous only for getting the first result. Generally, you can ask Prolog to find you more results. This is when the factorial/2 with A>0 removed would fail spectacularly, because it would start going down the invocation chain of the second clause with negative numbers - a chain of calls that will end in numeric overflow or stack overflow, whichever comes first.
If you come from a procedural language background, the following C++ code might help. It mirrors pretty accurately the way the Prolog code executes (at least for the common case that A is given and B is uninstantiated):
bool fac(int a, int &b)
{
int c,d;
return
a==0 && (b=1,true)
||
a>0 && (c=a-1,true) && fac(c,d) && (b=a*d,true);
}
The Prolog comma operates like the sequential &&, and multiple clauses like a sequential ||.
My mental model for how prolog works is a tree traversal.
The facts and predicates in a prolog database form a forest of trees. When you ask the Prolog engine to evaluate a predicate:
?- factorial(6,N).
the Prolog engine looks for the tree rooted with the specified functor and arity (factorial/2 in this case). The Prolog engine then performs a depth-first traversal of that tree trying to find a solution using unification and pattern matching. Facts are evaluated as they are; For predicates, the right-hand side of the :- operator is evaluated, walking further into the tree, guided by the various logical operators.
Evaluation stops with the first successful evaluation of a leaf node in the tree, with the prolog engine remembering its state in the tree traversal. On backtracking, the tree traversal continues from where it left off. Execution is finally complete when the tree traversal is completed and there are no more paths to follow.
That's why Prolog is a descriptive language rather than an imperative language: you describe what constitutes truth (or falsity) and let the Prolog engine figure out how to get there.
Related
I'm trying to write a program to solve the towers of hanoi problem in Prolog. None of the posts here helped me so I decided to ask for myself. I have written the following code. It works well for 2 disks but goes into an infinite loop for 3.
hanoi(1,A,B,_,[(A,B)]).
hanoi(X,A,B,C,Y):-
hanoi(X2,A,C,B,Y1),
hanoi(1,A,B,_,[Y2]),
hanoi(X2,C,B,A,Y3),
append(Y1,[Y2|Y3],Y),
X2 is X-1.
It is called in the following way:
?- hanoi(3, a, b, c, Y).
a,b,c are the pegs. 3 is the number of disks and X is where we want the result.
I need to get the result in Y. I'm trying to recursively find the moves for X-1 discs from peg 1 to 3 using 2, 1 disc from peg 1 to 2, X-1 discs from peg 3 to 2 and append them. I can't understand what I'm doing wrong. Any help or guidance would be appreciated! Thanks!
The obvious problem -
When you have a conjunction, like:
a, b, c
You have two readings of this, logical and procedural. The logical reading would be:
"The conjunction is true if a is true, and b is true, and c is true."
The procedural reading is:
"The conjunction will succeed if a is evaluated and succeeds, then b is evaluated and it also succeeds, and then c is evaluated and it also succeeds." (or, to put it in other words, do a depth-first search of the solution space)
If you are careful enough, the procedural reading is not necessary to argue about your code. However, this is only the case when all subgoals in the conjunction have full overlap of the logical and the procedural reading.
The offender here is is/2. It does not have a purely logical meaning. It evaluates the right-hand operand (the arithmetic expression) and unifies the result with the left-hand (usually an unbound variable).
Here, you have a conjunction (in the body of the second clause) that says, in effect, "evaluate a(X), then, if it succeeds, find the value of X". The obvious problem is that a(X) requires the value of X to be evaluated in such a way that it terminates.
So, move the is before all subgoals that use its result, or look into using constraints.
I am currently trying to learn some basic prolog. As I learn I want to stay away from if else statements to really understand the language. I am having trouble doing this though. I have a simple function that looks like this:
if a > b then 1
else if
a == b then c
else
-1;;
This is just very simple logic that I want to convert into prolog.
So here where I get very confused. I want to first check if a > b and if so output 1. Would I simply just do:
sample(A,B,C,O):-
A > B, 1,
A < B, -1,
0.
This is what I came up with. o being the output but I do not understand how to make the 1 the output. Any thoughts to help me better understand this?
After going at it some more I came up with this but it does not seem to be correct:
Greaterthan(A,B,1.0).
Lessthan(A,B,-1.0).
Equal(A,B,C).
Sample(A,B,C,What):-
Greaterthan(A,B,1.0),
Lessthan(A,B,-1.0),
Equal(A,B,C).
Am I headed down the correct track?
If you really want to try to understand the language, I recommend using CapelliC's first suggestion:
sample(A, B, _, 1) :- A > B.
sample(A, B, C, C) :- A == B.
sample(A, B, _, -1) :- A < B.
I disagree with CappeliC that you should use the if/then/else syntax, because that way (in my experience) it's easy to fall into the trap of translating the different constructs, ending up doing procedural programming in Prolog, without fully grokking the language itself.
TL;DR: Don't.
You are trying to translate constructs you know from other programming languages to Prolog. With the assumption that learning Prolog means essentially mapping one construct after the other into Prolog. After all, if all constructs have been mapped, you will be able to encode any program into Prolog.
However, by doing that you are missing the essence of Prolog altogether.
Prolog consists of a pure, monotonic core and some procedural adornments. If you want to understand what distinguishes Prolog so much from other programming languages you really should study its core first. And that means, you should ignore those other parts. You have only so much attention span, and if you waste your time with going through all of these non-monotonic, even procedural constructs, chances are that you will miss its essence.
So, why is a general if-then-else (as it has been proposed by several answers) such a problematic construct? There are several reasons:
In the general case, it breaks monotonicity. In pure monotonic Prolog programs, adding a new fact will increase the set of true statements you can derive from it. So everything that was true before adding the fact, will be true thereafter. It is this property which permits one to reason very effectively over programs. However, note that monotonicity means that you cannot model every situation you might want to model. Think of a predicate childless/1 that should succeed if a person does not have a child. And let's assume that childless(john). is true. Now, if you add a new fact about john being the parent of some child, it will no longer hold that childless(john) is true. So there are situations that inherently demand some non-monotonic constructs. But there are many situations that can be modeled in the monotonic part. Stick to those first.
if-then-else easily leads to hard-to-read nesting. Just look at your if-then-else-program and try to answer "When will the result be -1"? The answer is: "If neither a > b is true nor a == b is true". Lengthy, isn't it? So the people who will maintain, revise and debug your program will have to "pay".
From your example it is not clear what arguments you are considering, should you be happy with integers, consider to use library(clpfd) as it is available in SICStus, SWI, YAP:
sample(A,B,_,1) :- A #> B.
sample(A,B,C,C) :- A #= B.
sample(A,B,_,-1) :- A #< B.
This definition is now so general, you might even ask
When will -1 be returned?
?- sample(A,B,C,-1).
A = B, C = -1, B in inf..sup
; A#=<B+ -1.
So there are two possibilities.
Here are some addenda to CapelliC's helpful answer:
When starting out, it is sometimes easy to mistakenly conceive of Prolog predicates functionally. They are either not functions at all, or they are n-ary functions which only ever yield true or false as outputs. However, I often find it helpful to forget about functions and just think of predicates relationally. When we define a predicate p/n, we're describing a relation between n elements, and we've named the relation p.
In your case, it sounds like we're defining conditions on an ordered triplet, <A, B, C>, where the value of C depends upon the relation between A and B. There are three relevant relationships between A and B (here, since we are dealing with a simple case, these three are exhaustive for the kind of relationship in question), and we can simply describe what value C should have in the three cases.
sample(A, B, 1.0) :-
A > B.
sample(A, B, -1.0) :-
A < B.
sample(A, B, some_value) :-
A =:= B.
Notice that I have used the arithmetical operator =:=/2. This is more specific than ==/2, and it lets us compare mathematical expressions for numerical equality. ==/2 checks for equivalence of terms: a == a, 2 == 2, 5+7 == 5+7 are all true, because equivalent terms stand on the left and right of the operator. But 5+7 == 7+5, 5+7 == 12, A == a are all false, since it are the terms themselves which are being compared and, in the first case the values are reversed, in the second we're comparing +(5,7) with an integer and in the third we're comparing a free variable with an atom. The following, however, are true: 2 =:= 2, 5 + 7 =:= 12, 2 + 2 =:= 4 + 0. This will let us unify A and B with evaluable mathematical expressions, rather than just integers or floats. We can then pose queries such as
?- sample(2^3, 2+2+2, X).
X = 1.0
?- sample(2*3, 2+2+2, X).
X = some_value.
CapelliC points out that when we write multiple clauses for a predicate, we are expressing a disjunction. He is also careful to note that this particular example works as a plain disjunction only because the alternatives are by nature mutually exclusive. He shows how to get the same exclusivity entailed by the structure of your first "if ... then ... else if ... else ..." by intervening in the resolution procedure with cuts. In fact, if you consult the swi-prolog docs for the conditional ->/2, you'll see the semantics of ->/2 explained with cuts, !, and disjunctions, ;.
I come down midway between CapelliC and SQB in prescribing use of the control predicates. I think you are wise to stick with defining such things with separate clauses while you are still learning the basics of the syntax. However, ->/2 is just another predicate with some syntax sugar, so you oughtn't be afraid of it. Once you start thinking relationally instead of functionally or imperatively, you might find that ->/2 is a very nice tool for giving concise expression to patterns of relation. I would format my clause using the control predicates thus:
sample(A, B, Out) :-
( A > B -> Out = 1.0
; A =:= B -> Out = some_value
; Out = -1.0
).
Your code has both syntactic and semantic issues.
Predicates starts lower case, and the comma represent a conjunction. That is, you could read your clause as
sample(A,B,C,What) if
greaterthan(A,B,1.0) and lessthan(A,B,-1.0) and equal(A,B,C).
then note that the What argument is useless, since it doesn't get a value - it's called a singleton.
A possible way of writing disjunction (i.e. OR)
sample(A,B,_,1) :- A > B.
sample(A,B,C,C) :- A == B.
sample(A,B,_,-1) :- A < B.
Note the test A < B to guard the assignment of value -1. That's necessary because Prolog will execute all clause if required. The basic construct to force Prolog to avoid some computation we know should not be done it's the cut:
sample(A,B,_,1) :- A > B, !.
sample(A,B,C,C) :- A == B, !.
sample(A,B,_,-1).
Anyway, I think you should use the if/then/else syntax, even while learning.
sample(A,B,C,W) :- A > B -> W = 1 ; A == B -> W = C ; W = -1.
I'd like someone to explain this procedure if possible (from the book 'learn prolog now'). It takes two numerals and adds them together.
add(0,Y,Y).
add(s(X),Y,s(Z)) :- add(X,Y,Z).
In principle I understand, but I have a few issues. Lets say I issue the query
?- add(s(s(0)), s(0), R).
Which results in:
R = s(s(s(0))).
Step 1 is the match with rule 2. Now X becomes s(0) and Y is still s(0). However Z (according to the book) becomes s(_G648), or s() with an uninstantiated variable inside it. Why is this?
On the final step the 1st rule is matched which ends the recursion. Here the contents of Y somehow end up in the uninstantiated part of what was Z! Very confusing, I need a plain english explanation.
First premises:
We have s(X) defined as the successor of X so basically s(X) = X+1
The _G### notation is used in the trace for internal variables used for the recursion
Let's first look at another definition of addition with successors that I find more intuitive:
add(0,Y,Y).
add(s(A),B,C) :- add(A,s(B),C).
this does basically the same but the recursion is easier to see:
we ask
add(s(s(0)),s(0),R).
Now in the first step prolog says thats equivalent to
add(s(0),s(s(0)),R)
because we have add(s(A),B,C) :- add(A,s(B),C) and if we look at the question A = s(0) and B=s(0). But this still doesn't terminate so we have to reapply that equivalency with A=0 and B=s(s(0)) so it becomes
add(0,s(s(s(0))),R)
which, given add(0,Y,Y). this means that
R = s(s(s(0)))
Your definition of add basically does the same but with two recursions:
First it runs the first argument down to 0 so it comes down to add(0,Y,Y):
add(s(s(0)),s(0),R)
with X=s(0), Y = s(0) and s(Z) = R and Z = _G001
add(s(0),s(0),_G001)
with X = 0, Y=s(0) and s(s(Z)) = s(G_001) = R and Z = _G002
add(0,s(0),_G002)
So now it knows that _G002 is s(0) from the definition add(0,Y,Y) but has to trace its steps back so _G001 is s(_G002) and R is s(_G001) is s(s(_G002)) is s(s(s(0))).
So the point is in order to get to the definition add(0,Y,Y) prolog has to introduce internal variables for a first recursion from which R is then evaluated in a second one.
If you want to understand the meaning of a Prolog program, you might concentrate first on what the relation describes. Then you might want to understand its termination properties.
If you go into the very details of a concrete execution as your question suggests, you will soon be lost in the multiplicity of details. After all, Prolog has two different interlaced control flows (AND- and OR-control) and in addition to that it has unification which subsumes parameter passing, assignment, comparison, and equation solving.
Brief: While computers execute a concrete query effortlessly for zillions of inferences, you will get tired after a screenful of them. You can't beat computers in that. Fortunately, there are better ways to understand a program.
For the meaning, look at the rule first. It reads:
add(s(X),Y,s(Z)) :- add(X,Y,Z).
See the :- in between? It is meant to symbolize an arrow. It is a bit unusual that the arrow points from right-to-left. In informal writing you would write it rather left-to-right. Read this as follows:
Provided, add(X,Y,Z) is true, then also add(s(X),Y,s(Z)) is true.
So we assume that we have already some add(X,Y,Z) meaning "X+Y=Z". And given that, we can conclude that also "(X+1)+Y=(Z+1)" holds.
After that you might be interested to understand it's termination properties. Let me make this very brief: To understand it, it suffices to look at the rule: The 2nd argument is only handed further on. Therefore: The second argument does not influence termination. And both the 1st and 3rd argument look the same. Therefore: They both influence termination in the same manner!
In fact, add/3 terminates, if either the 1st or the 3rd argument will not unify with s(_).
Find more about it in other answers tagged failure-slice, like:
Prolog successor notation yields incomplete result and infinite loop
But now to answer your question for add(s(s(0)), s(0), R). I only look at the first argument: Yes! This will terminate. That's it.
Let's divide the problem in three parts: the issues concerning instantiation of variables and the accumulator pattern which I use in a variation of that example:
add(0,Y,Y).
add(s(X),Y,Z):-add(X,s(Y),Z).
and a comment about your example that uses composition of substitutions.
What Prolog applies in order to see which rule (ie Horn clause) matches (whose head unifies) is the Unification Algorithm which tells, in particular, that if I have a variable, let's say, X and a funtor, ie, f(Y) those two term unify (there is a small part about the occurs check to...check but nevermind atm) hence there is a substitution that can let you convert one into another.
When your second rule is called, indeed R gets unified to s(Z). Do not be scared by the internal representation that Prolog gives to new, uninstantiated variables, it is simply a variable name (since it starts with '_') that stands for a code (Prolog must have a way to express constantly newly generated variables and so _G648, _G649, _G650 and so on).
When you call a Prolog procedure, the parameters you pass that are uninstantiated (R in this case) are used to contain the result of the procedure as it completes its execution, and it will contain the result since at some point during the procedure call it will be instantied to something (always through unification).
If at some point you have that a var, ie K is istantiated to s(H) (or s(_G567) if you prefer), it is still partilally instantiated and to have your complete output you need to recursively instantiate H.
To see what it will be instantiated to, have a read at the accumulator pattern paragraph and the sequent one, tho ways to deal with the problem.
The accumulator pattern is taken from functional programming and, in short, is a way to have a variable, the accumulator (in my case Y itself), that has the burden to carry the partial computations between some procedure calls. The pattern relies on recursion and has roughly this form:
The base step of the recursion (my first rule ie) says always that since you have reached the end of the computation you can copy the partial result (now total) from your accumulator variable to your output variable (this is the step in which, through unification your output var gets instantiated!)
The recursive step tells how to create a partial result and how to store it in the accumulator variable (in my case i 'increment' Y). Note that in the recursive step the output variable is never changed.
Finally, concerning your exemple, it follows another pattern, the composition of substitutions which I think you can understand better having thought about accumulator and instantiation via unification.
Its base step is the same as the accumulator pattern but Y never changes in the recursive step while Z does
It uses to unify the variable in Z with Y by partially instantiating all the computation at the end of each recursive call after you've reached the base step and each procedure call is ending. So at the end of the first call the inner free var in Z has been substituted by unification many times by the value in Y.
Note the code below, after you have reached the bottom call, the procedure call stack starts to pop and your partial vars (S1, S2, S3 for semplicity) gets unified until R gets fully instantiated
Here is the stack trace:
add(s(s(s(0))),s(0),S1). ^ S1=s(S2)=s(s(s(s(0))))
add( s(s(0)) ,s(0),S2). | S2=s(S3)=s(s(s(0)))
add( s(0) ,s(0),S3). | S3=s(S4)=s(s(0))
add( 0 ,s(0),S4). | S4=s(0)
add( 0 ,s(0),s(0)). ______|
I have been trying to acclimate to Prolog and Horn clauses, but the transition from formal logic still feels awkward and forced. I understand there are advantages to having everything in a standard form, but:
What is the best way to define the material conditional operator --> in Prolog, where A --> B succeeds when either A = true and B = true OR B = false? That is, an if->then statement that doesn't fail when if is false without an else.
Also, what exactly are the non-obvious advantages of Horn clauses?
What is the best way to define the material conditional operator --> in Prolog
When A and B are just variables to be bound to the atoms true and false, this is easy:
cond(false, _).
cond(_, true).
But in general, there is no best way because Prolog doesn't offer proper negation, only negation as failure, which is non-monotonic. The closest you can come with actual propositions A and B is often
(\+ A ; B)
which tries to prove A, then goes on to B if A cannot be proven (which does not mean that it is false due to the closed-world assumption).
Negation, however, should be used with care in Prolog.
Also, what exactly are the non-obvious advantages of Horn clauses?
That they have a straightforward procedural reading. Prolog is a programming language, not a theorem prover. It's possible to write programs that have a clear logical meaning, but they're still programs.
To see the difference, consider the classical problem of sorting. If L is a list of numbers without duplicates, then
sort(L, S) :-
permutation(L, S),
sorted(S).
sorted([]).
sorted([_]).
sorted([X,Y|L]) :-
X < Y,
sorted([Y|L]).
is a logical specification of what it means for S to contain the elements of L in sorted order. However, it also has a procedural meaning, which is: try all the permutations of L until you have one that it sorted. This procedure, in the worst case, runs through all n! permutations, even though sorting can be done in O(n lg n) time, making it a very poor sorting program.
See also this question.
I have the next rules
% Signature: natural_number(N)/1
% Purpose: N is a natural number.
natural_number(0).
natural_number(s(X)) :-
natural_number(X).
ackermann(0, N, s(N)). % rule 1
ackermann(s(M),0,Result):-
ackermann(M,s(0),Result). % rule 2
ackermann(s(M),s(N),Result):-
ackermann(M,Result1,Result),
ackermann(s(M),N,Result1). % rule 3
The query is: ackermann (M,N,s(s(0))).
Now, as I understood, In the third calculation, we got an infinite search (failure branch). I check it, and I got a finite search (failure branch).
I'll explain: In the first, we got a substitution of M=0, N=s(0) (rule 1 - success!). In the second, we got a substitution of M=s(0),N=0 (rule 2 - success!). But what now? I try to match M=s(s(0)) N=0, But it got a finite search - failure branch. Why the compiler doesn't write me "fail".
Thank you.
It was a bit hard to understand exactly what Tom is asking here. Perhaps there's an expectation that the predicate natural_number/1 somehow influences the execution of ackermann/3. It will not. The latter predicate is purely recursive and makes no subgoals that depend on natural_number/1.
When the three clauses shown are defined for ackermann/3, the goal:
?- ackermann(M,N,s(s(0))).
causes SWI-Prolog to find (with backtracking) the two solutions that Tom reports, and then to go into infinite recursion (resulting in an "Out of Stack" error). We can be sure that this infinite recursion involves the third clause given for ackermann/3 (rule 3 per Tom's comments in code) because in its absence we only get the two acknowledged solutions and then explicit failure:
M = 0,
N = s(0) ;
M = s(0),
N = 0 ;
false.
It seems to me Tom asks for an explanation of why changing the submitted query to one that sets M = s(s(0)) and N = 0, producing a finite search (that finds one solution and then fails on backtracking), is consistent with the infinite recursion produced by the previous query. My suspicion here is that there's a misunderstanding of what the Prolog engine attempts in backtracking (for the original query), so I'm going to drill down on that. Hopefully it clears up the matter for Tom, but let's see if it does. Admittedly my treatment is wordy, but the Prolog execution mechanism (unification and resolution of subgoals) is worthy of study.
[Added: The predicate has an obvious connection to the famous Ackermann function that is total computable but not primitive recursive. This function is known for growing rapidly, so we need to be careful in claiming infinite recursion because a very large but finite recursion is also possible. However the third clause puts its two recursive calls in an opposite order to what I would have done, and this reversal seems to play a critical role in the infinite recursion we find in stepping through the code below.]
When the top-level goal ackermann(M,N,s(s(0))) is submitted, SWI-Prolog tries the clauses (facts or rules) defined for ackermann/3 until it finds one whose "head" unifies with the given query. The Prolog engine does not have far to look as the first clause, this fact:
ackermann(0, N, s(N)).
will unify, binding M = 0 and N = s(0) as has already been described as the first success.
If requested to backtrack, e.g. by user typing semi-colon, the Prolog engine checks to see if there is an alternative way to satisfy this first clause. There is not. Then the Prolog engine proceeds to attempt the following clauses for ackermann/3 in their given order.
Again the search does not have to go far because the second clause's head also unifies with the query. In this case we have a rule:
ackermann(s(M),0,Result) :- ackermann(M,s(0),Result).
Unifying the query and the head of this rule yields the bindings M = s(0), N = 0 in terms of the variables used in the query. In terms of the variables used in the rule as stated above, M = 0 and Result = s(s(0)). Note that unification matches terms by their appearance as calling arguments and does not consider variable names reused across the query/rule boundary as signifying identity.
Because this clause is a rule (having body as well as head), unification is just the first step in trying to succeed with it. The Prolog engine now attempts the one subgoal that appears in the body of this rule:
ackermann(0,s(0),s(s(0))).
Note that this subgoal comes from replacing the "local" variables used in the rule by the values of unification, M = 0 and Result = s(s(0)). The Prolog engine is now calling the predicate ackermann/3 recursively, to see if this subgoal can be satisfied.
It can, as the first clause (fact) for ackermann/3 unifies in the obvious way (indeed in essentially the same way as before as regards the variables used in the clause). And thus (upon this recursive call succeeding), we get the second solution succeeding in the outer call (the top-level query).
If the user asks the Prolog engine to backtrack once more, it again checks to see if the current clause (the second one for ackermann/3) can be satisfied in an alternative way. It cannot, and so the search continues by passing to the third (and last) clause for predicate ackermann/3:
ackermann(s(M),s(N),Result) :-
ackermann(M,Result1,Result),
ackermann(s(M),N,Result1).
I'm about to explain that this attempt does produce infinite recursion. When we unify the top-level query with the head of this clause, we get bindings for the arguments that can perhaps be clearly understood by aligning the terms in in the query with those in the head:
query head
M s(M)
N s(N)
s(s(0)) Result
Bearing in mind that variables having the same name in the query as variables in the rule does not constrain unification, this triple of terms can be unified. Query M will be head s(M), that is a compound term involving functor s applied to some as-yet unknown variable M appearing in the head. Same thing for query N. The only "ground" term so far is variable Result appearing in the head (and body) of the rule, which has been bound to s(s(0)) from the query.
Now the third clause is a rule, so the Prolog engine must continue by attempting to satisfy the subgoals appearing in the body of that rule. If you substitute values from the head unification into the body, the first subgoal to satisfy is:
ackermann(M,Result1,s(s(0))).
Let me point out that I've used here the "local" variables of the clause, except that I've replaced Result by the value to which it was bound in unification. Now notice that apart from replacing N of the original top-level query by variable name Result1, we are just asking the same thing as the original query in this subgoal. Certainly it's a big clue we might be about to enter an infinite recursion.
However a bit more discussion is needed to see why we don't get any further solutions reported! This is because the first success of that first subgoal is (just as found earlier) going to require M = 0 and Result1 = s(0), and the Prolog engine must then proceed to attempt the second subgoal of the clause:
ackermann(s(0),N,s(0)).
Unfortunately this new subgoal does not unify with the first clause (fact) for ackermann/3. It does unify with the head of the second clause, as follows:
subgoal head
s(0) s(M)
N 0
s(0) Result
but this leads to a sub-subgoal (from the body of the second clause):
ackermann(0,s(0),s(0)).
This does not unify with the head of either the first or second clause. It also does not unify with the head of the third clause (which requires the first argument to have the form s(_)). So we reached a point of failure in the search tree. The Prolog engine now backtracks to see if the first subgoal of the third clause's body can be satisfied in an alternative way. As we know, it can be (since this subgoal is basically the same as the original query).
Now M = s(0) and Result1 = 0 of that second solution leads to this for the second subgoal of the third clause's body:
ackermann(s(s(0)),N,0).
While this does not unify with the first clause (fact) of the predicate, it does unify with the head of the second clause:
subgoal head
s(s(0)) s(M)
N 0
0 Result
But in order to succeed the Prolog engine must satisfy the body of the second clause as well, which is now:
ackermann(s(s(0)),s(0),0).
We can easily see this cannot unify with the head of either the first or second clause for ackermann/3. It can be unified with the head of the third clause:
sub-subgoal head(3rd clause)
s(s(0)) s(M)
s(0) s(N)
0 Result
As should be familiar now, the Prolog engine checks to see if the first subgoal of the third clause's body can be satisfied, which amounts to this sub-sub-subgoal:
ackermann(s(0),Result1,0).
This fails to unify with the first clause (fact), but does unify with the head of the second clause binding M = 0, Result1 = 0 and Result = 0 , producing (by familiar logic) the sub-sub-sub-subgoal:
ackermann(0,0,0).
Since this cannot be unified with any of the three clauses' heads, this fails. At this point the Prolog engine backtracks to trying to satisfy the above sub-sub-subgoal using the third clause. Unification goes like this:
sub-sub-subgoal head(3rd clause)
s(0) s(M)
Result1 s(N)
0 Result
and the Prolog engine's task is then to satisfy this sub-sub-sub-subgoal derived from the first part of the third clause's body:
ackermann(0,Result1,0).
But this will not unify with the head of any of the three clauses. The search for a solution to the sub-sub-subgoal above terminates in failure. The Prolog engine backtracks all the way to where it first tried to satisfy the second subgoal of the third clause as invoked by the original top-level query, as this has now failed. In other words it tried to satisfy it with the first two solutions of the first subgoal of the third clause, which you will recall was in essence the same except for change of variable names as the original query:
ackermann(M,Result1,s(s(0))).
What we've seen above are that solutions for this subgoal, duplicating the original query, from the first and second clauses of ackermann/3, do not permit the second subgoal of the third clause's body to succeed. Therefore the Prolog engine tries to find solutions that satisfy the third clause. But clearly this is now going into infinite recursion, as that third clause will unify in its head, but the body of the third clause will repeat exactly the same search we just chased through. So the Prolog engine now winds up going into the body of the third clause endlessly.
Let me rephrase your question: The query ackermann(M,N,s(s(0))). finds two solutions and then loops. Ideally, it would terminate after these two solutions, as there is no other N and M whose value is s(s(0)).
So why does the query not terminate universally? Understanding this can be quite complex, and the best advice is to not attempt to understand the precise execution mechanism. There is a very simple reason: Prolog's execution mechanism turns out to be that complex that you easily will misunderstand it anyway if you attempt to understand it by stepping through the code.
Instead, you can try the following: Insert goals false at any place in your program. If the resulting program does not terminate, then also the original program will not terminate.
In your case:
ackermann(0, N, s(N)) :- false.
ackermann(s(M),0,Result):- false,
ackermann(M,s(0),Result).
ackermann(s(M),s(N),Result):-
ackermann(M,Result1,Result), false,
ackermann(s(M),N,Result1).
We can now remove the first and second clause. And in the third clause, we can remove the goal after false. So if the following fragment does not terminate, also the original program will not terminate.
ackermann(s(M),s(N),Result):-ackermann(M,Result1,Result), false.
This program now terminates only if the first argument is known. But in our case it's free...
That is: By considering a small fraction of the program (called a failure-slice) we already were able to deduce a property of the entire program. For details, see this paper and others on the site.
Unfortunately, that kind of reasoning only works for cases of non-termination. For termination, things are more complex. The best is to try a tool like cTI which infers termination conditions and tries to prove their optimality. I entered your program already, so try to modify if and see the effects!
If we are at it: This small fragment also tells us that the second argument does not influence termination1. That means, that queries like ackermann(s(s(0)),s(s(0)),R). will
not terminate either. Exchange the goals to see the difference...
1 To be precise, a term that does not unify with s(_) will influence termination. Think of 0. But any s(0), s(s(0)), ... will not influence termination.