Recursive predicate to output difference of lists (without duplicated elements) - prolog

I have the below recursive predicate to check the set difference in lists and output these.
I have this working but also outputs duplicated values.
Could anyone tell me how to fix this so that the output is without duplicated values. Thanks
setDiff([],Y,[]).
setDiff([X|R],Y,Z) :- member(X,Y), setDiff(R,Y,Z).
setDiff([X|R],Y,[X|Z]) :- \+(member(X,Y)), setDiff(R,Y,Z).
Expected output
?- setDiff([1,2,3,3,a], [b,d,2], X).
X = [a, 3, 1] ;
actual output
?- setDiff([1,2,3,3,a], [b,d,2], X).
X = [a, 3, 3, 1] ;

A simple solution is to add a condition in rule 3, which unifies the third argument with [X|Z] if X isn't already in Z, and unifies it with Z if X is already in Z:
setDiff([],_,[]).
setDiff([X|R],Y,Z) :- member(X,Y), setDiff(R,Y,Z).
setDiff([X|R],Y,L) :-
    \+ member(X,Y),
    setDiff(R,Y,Z),
    (   member(X, Z) ->
        L = Z
    ;   L = [X|Z]
    ).
Note: I replaced Y with _ in the first rule, to avoid a "singleton variable" warning: there is indeed no need to name Y in that case because it is never used in the rule. This warning helps you detect probable errors where a named variable is only used once (which is never useful).

If you want to keep it tail-recursive (so more efficient), you can use two parameters for the output list:
setDiff(Xs,Ys,Zs) :-
    setDiff(Xs,Ys,[],RevZs), reverse(RevZs,Zs).
setDiff([X|Xs],Ys,Zs0,Zs) :-
    (   ( member(X,Ys) ; member(X,Zs0) ) ->
        Zs1 = Zs0
    ;   Zs1 = [X|Zs0] ),
    setDiff(Xs,Ys,Zs1,Zs).
setDiff([],_,Zs,Zs).
(This also avoids calling member/2 twice with the same parameters.)
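For comparison, the same result can be sketched with library predicates instead of a hand-written recursion. This is an alternative, not the approach of the answers above. It assumes SWI-Prolog's library(lists), which provides subtract/3 and list_to_set/2; the name set_diff/3 is made up for illustration.

```prolog
:- use_module(library(lists)).   % subtract/3, list_to_set/2 (SWI-Prolog)

% set_diff(+Xs, +Ys, -Zs): Zs holds the elements of Xs that are not
% in Ys, each kept once, in order of first occurrence.
% (set_diff/3 is a hypothetical name, not part of the original code.)
set_diff(Xs, Ys, Zs) :-
    subtract(Xs, Ys, Ds),      % drop every element that occurs in Ys
    list_to_set(Ds, Zs).       % drop repeated elements, keep the first
```

With the lists from the question, ?- set_diff([1,2,3,3,a], [b,d,2], X). gives X = [1, 3, a].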


Understanding Prolog Lists

I am trying to understand Prolog lists, and how values are 'returned' / instantiated at the end of a recursive function.
I am looking at this simple example:
val_and_remainder(X,[X|Xs],Xs).
val_and_remainder(X,[Y|Ys],[Y|R]) :-
val_and_remainder(X,Ys,R).
If I call val_and_remainder(X, [1,2,3], R). then I will get the following outputs:
X = 1, R = [2,3];
X = 2, R = [1,3];
X = 3, R = [1,2];
false.
But I am confused as to why in the base case (val_and_remainder(X,[X|Xs],Xs).) Xs has to appear as it does.
If I was to call val_and_remainder(2, [1,2,3], R). then it seems to me as though it would run through the program as:
% Initial call
val_and_remainder(2, [1,2,3], R).
val_and_remainder(2, [1|[2,3]], [1|R]) :- val_and_remainder(2, [2,3], R).
% Hits base case
val_and_remainder(2, [2|[3]], [3]).
If the above run through is correct then how does it get the correct value for R? As in the above case the value of R should be R = [1,3].
In Prolog, you need to think of predicates not as functions as you would normally in other languages. Predicates describe relationships which might include arguments that help define that relationship.
For example, let's take this simple case:
same_term(X, X).
This is a predicate that defines a relationship between two arguments. Through unification it is saying that the first and second arguments are the same if they are unified (and that definition is up to us, the writers of the predicate). Thus, same_term(a, a) will succeed, same_term(a, b) will fail, and same_term(a, X) will succeed with X = a.
You could also write this in a more explicit form:
same_term(X, Y) :-
X = Y. % X and Y are the same if they are unified
Now let's look at your example, val_and_remainder/3. First, what does it mean?
val_and_remainder(X, List, Rest)
This means that X is an element of List and Rest is a list consisting of all of the rest of the elements (without X). (NOTE: You didn't explain this meaning right off, but I'm inferring it from the implementation in your example.)
Now we can write out to describe the rules. First, a simple base case:
val_and_remainder(X,[X|Xs],Xs).
This says that:
Xs is the remainder of list [X|Xs] without X.
This statement should be pretty obvious by the definition of the [X|Xs] syntax for a list in Prolog. You need all of these arguments because the third argument Xs must unify with the tail (rest) of list [X|Xs], which is then also Xs (variables of the same name are, by definition, unified). As before, you could write this out in more detail as:
val_and_remainder(X, [H|T], R) :-
X = H,
R = T.
But the short form is actually clearer.
Now the recursive clause says:
val_and_remainder(X, [Y|Ys], [Y|R]) :-
val_and_remainder(X, Ys, R).
So this means:
[Y|R] is the remainder of list [Y|Ys] without X if R is the remainder of list Ys without the element X.
You need to think about that rule to convince yourself that it is logically true. The Y is the same in second and third arguments because they are referring to the same element, so they must unify.
So these two predicate clauses form two rules that cover both cases. The first case is the simple case where X is the first element of the list. The second case is a recursive definition for when X is not the first element.
When you make a query, such as val_and_remainder(2, [1,2,3], R). Prolog looks to see if it can unify the term val_and_remainder(2, [1,2,3], R) with a fact or the head of one of your predicate clauses. It fails in its attempt to unify with val_and_remainder(X,[X|Xs],Xs) because it would need to unify X with 2, which means it would need to unify [1,2,3] with [2|Xs] which fails since the first element of [1,2,3] is 1, but the first element of [2|Xs] is 2.
So Prolog moves on and successfully unifies val_and_remainder(2, [1,2,3], R) with val_and_remainder(X,[Y|Ys],[Y|R]) by unifying X with 2, Y with 1, Ys with [2,3], and the R of your query with [Y|R]. NOTE: this is important. The R variable in your call is NOT the same as the R variable in the predicate definition, so to avoid confusion we'll rename the R of your query to R1 and say that R1 is unified with [Y|R].
When the body of the second clause is executed, it calls val_and_remainder(X,Ys,R). or, in other words, val_and_remainder(2, [2,3], R). This will unify now with the first clause and give you R = [3]. When you unwind all of that, you get, R1 = [Y|[3]], and recalling that Y was bound to 1, the result is R1 = [1,3].
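The renaming described above can be replayed by hand. The sketch below is only an illustration: by_hand/1 is a hypothetical helper that spells out, as explicit unifications, what the second clause does for the query val_and_remainder(2, [1,2,3], R1).

```prolog
val_and_remainder(X, [X|Xs], Xs).
val_and_remainder(X, [Y|Ys], [Y|R]) :-
    val_and_remainder(X, Ys, R).

% by_hand(-R1): replays the second clause for the query
% val_and_remainder(2, [1,2,3], R1), with the query variable named R1
% so that it cannot be confused with the clause variable R.
% (by_hand/1 is a hypothetical helper, for illustration only.)
by_hand(R1) :-
    [Y|Ys] = [1,2,3],               % head unification: Y = 1, Ys = [2,3]
    R1 = [Y|R],                     % the query's R1 becomes [1|R]
    val_and_remainder(2, Ys, R).    % body: the first clause gives R = [3]
```

Calling ?- by_hand(R1). yields R1 = [1, 3], the same answer as the original query.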
Stepwise reproduction of Prolog's mechanism often leads to more confusion than it helps. You probably have notions like "returning" meaning something very specific—more appropriate to imperative languages.
Here are different approaches you can always use:
Ask the most general query
... and let Prolog explain to you what the relation is about.
?- val_and_remainder(X, Xs, Ys).
Xs = [X|Ys]
; Xs = [_A,X|_B], Ys = [_A|_B]
; Xs = [_A,_B,X|_C], Ys = [_A,_B|_C]
; Xs = [_A,_B,_C,X|_D], Ys = [_A,_B,_C|_D]
; Xs = [_A,_B,_C,_D,X|_E], Ys = [_A,_B,_C,_D|_E]
; ... .
So Xs and Ys share a common list prefix, Xs has thereafter an X, followed by a common rest. This query would continue producing further answers. Sometimes, you want to see all answers, then you have to be more specific. But don't be too specific:
?- Xs = [_,_,_,_], val_and_remainder(X, Xs, Ys).
Xs = [X,_A,_B,_C], Ys = [_A,_B,_C]
; Xs = [_A,X,_B,_C], Ys = [_A,_B,_C]
; Xs = [_A,_B,X,_C], Ys = [_A,_B,_C]
; Xs = [_A,_B,_C,X], Ys = [_A,_B,_C]
; false.
So here we got all possible answers for a four-element list. All of them.
Stick to ground goals when going through specific inferences
So instead of val_and_remainder(2, [1,2,3], R). (which obviously got your head spinning) rather consider val_and_remainder(2, [1,2,3], [1,3]). and then
val_and_remainder(2, [2,3],[3]). From this side it should be obvious.
Read Prolog rules right-to-left
See Prolog rules as production rules. Thus, whenever everything holds on the right-hand side of a rule, you can conclude what is on the left. Thus, the :- is an early 1970s' representation of a ←
Later on, you may want to ponder more complex questions, too. Like
Functional dependencies
Does the first and second argument uniquely determine the last one? Does X, Xs → Ys hold?
Here is a sample query that asks for Ys and Ys2 being different for the same X and Xs.
?- val_and_remainder(X, Xs, Ys), val_and_remainder(X, Xs, Ys2), dif(Ys,Ys2).
Xs = [X,_A,X|_B], Ys = [_A,X|_B], Ys2 = [X,_A|_B], dif([_A,X|_B],[X,_A|_B])
; ... .
So apparently, there are different values for Ys for a given X and Xs. Here is a concrete instance:
?- val_and_remainder(x, [x,a,x], Ys).
Ys = [a,x]
; Ys = [x,a]
; false.
There is no classical returning here. It does not return once but twice. It's more of a yield.
Yet, there is in fact a functional dependency between the arguments! Can you find it? And can you Prolog-wise prove it (as much as Prolog can do a proof, indeed)?
From comment:
How the result of R is correct, because if you look at my run-through
of a program call, the value of Xs isn't [1,3], which is what it
eventually outputs; it is instead [3] which unifies to R (clearly I am
missing something along the way, but I am unsure what that is).
This is correct:
% Initial call
val_and_remainder(2, [1,2,3], R).
val_and_remainder(2, [1|[2,3]], [1|R]) :- val_and_remainder(2, [2,3], R).
% Hits base case
val_and_remainder(2, [2|[3]], [3]).
however Prolog is not like other programming languages where you enter with input and exit with output at a return statement. In Prolog you move forward through the predicate statements unifying and continuing with predicates that are true, and upon backtracking also unifying the unbound variables. (That is not technically correct but it is easier to understand for some if you think of it that way.)
You did not take into consideration the unbound variables that are now bound upon backtracking.
When you hit the base case Xs was bound to [3],
but when you backtrack you have to look at
val_and_remainder(2, [1|[2,3]], [1|R])
and in particular [1|R] for the third parameter.
Since Xs was unified with R in the call to the base case, i.e.
val_and_remainder(X,[X|Xs],Xs).
R now has [3].
Now the third parameter position in
val_and_remainder(2, [1|[2,3]], [1|R])
is [1|R] which is [1|[3]] which as syntactic sugar is [1,3] and not just [3].
Now when the query
val_and_remainder(2, [1,2,3], R).
was run, the third parameter of the query R was unified with the third parameter of the predicate
val_and_remainder(X,[Y|Ys],[Y|R])
so R was unified with [Y|R] which upon backtracking is [1,3]
and thus the value bound to the query variable R is [1,3]
I don't understand the name of your predicate. It is a distraction anyway. The non-uniform naming of the variables is a distraction as well. Let's use some neutral, short one-syllable names to focus on the code itself in its clearest form:
foo( H, [H | T], T). % 1st clause
foo( X, [H | T], [H | R]) :- foo( X, T, R). % 2nd clause
So it's the built-in select/3. Yay!..
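The claim that foo/3 behaves like select/3 can be spot-checked on sample lists. The sketch below assumes SWI-Prolog's library(lists) for select/3; same_answers/1 is a hypothetical helper, and a check on a few lists is of course not a proof.

```prolog
:- use_module(library(lists)).   % select/3

foo(H, [H|T], T).                         % 1st clause
foo(X, [H|T], [H|R]) :- foo(X, T, R).     % 2nd clause

% same_answers(+List): foo/3 and select/3 enumerate the same
% Element-Rest pairs for List, in the same order.
% (same_answers/1 is a hypothetical helper, for illustration only.)
same_answers(List) :-
    findall(X-R, foo(X, List, R), Pairs),
    findall(X-R, select(X, List, R), Pairs).
```

For example, ?- same_answers([1,2,3]). succeeds.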
Now you ask about the query foo( 2, [1,2,3], R) and how R gets its value set correctly. The main thing missing from your rundown is the renaming of variables when a matching clause is selected. The resolution of the query goes like this:
|- foo( 2, [1,2,3], R) ? { }
%% SELECT -- 1st clause, with rename
|- ? { foo( H1, [H1|T1], T1) = foo( 2, [1,2,3], R) }
**FAIL** (2 = 1)
**BACKTRACK to the last SELECT**
%% SELECT -- 2nd clause, with rename
|- foo( X1, T1, R1) ?
{ foo( X1, [H1|T1], [H1|R1]) = foo( 2, [1,2,3], R) }
**OK**
%% REWRITE
|- foo( X1, T1, R1) ?
{ X1=2, [H1|T1]=[1,2,3], [H1|R1]=R }
%% REWRITE
|- foo( 2, [2,3], R1) ? { R=[1|R1] }
%% SELECT -- 1st clause, with rename
|- ? { foo( H2, [H2|T2], T2) = foo( 2, [2,3], R1), R=[1|R1] }
** OK **
%% REWRITE
|- ? { H2=2, T2=[3], T2=R1, R=[1|R1] }
%% REWRITE
|- ? { R=[1,3] }
%% DONE
The goals between |- and ? are the resolvent, the equations inside { } are the substitution. The knowledge base (KB) is implicitly to the left of |- in its entirety.
On each step, the left-most goal in the resolvent is chosen, a clause with the matching head is chosen among the ones in the KB (while renaming all of the clause's variables in the consistent manner, such that no variable in the resolvent is used by the renamed clause, so there's no accidental variable capture), and the chosen goal is replaced in the resolvent with that clause's body, while the successful unification is added into the substitution. When the resolvent is empty, the query has been proven and what we see is the one successful and-branch in the whole and-or tree.
This is how a machine could be doing it. The "rewrite" steps are introduced here for ease of human comprehension.
So we can see here that the first successful clause selection results in the equation
R = [1 | R1 ]
, and the second, --
R1 = [3]
, which together entail
R = [1, 3]
This gradual top-down instantiation / fleshing-out of lists is a very characteristic Prolog way of doing things.
In response to the bounty challenge, regarding functional dependency in the relation foo/3 (i.e. select/3): in foo(A,B,C), any two ground values for B and C uniquely determine the value of A (or its absence):
2 ?- foo( A, [0,1,2,1,3], [0,2,1,3]).
A = 1 ;
false.
3 ?- foo( A, [0,1,2,1,3], [0,1,2,3]).
A = 1 ;
false.
4 ?- foo( A, [0,1,2,1,3], [0,1,2,4]).
false.
5 ?- foo( A, [0,1,1], [0,1]).
A = 1 ;
A = 1 ;
false.
Attempt to disprove it by a counterexample:
10 ?- dif(A1,A2), foo(A1,B,C), foo(A2,B,C).
Action (h for help) ? abort
% Execution Aborted
Prolog fails to find a counterexample.
Trying to see more closely what's going on, with iterative deepening:
28 ?- length(BB,NN), foo(AA,BB,CC), XX=[AA,BB,CC], numbervars(XX),
writeln(XX), (NN>3, !, fail).
[A,[A],[]]
[A,[A,B],[B]]
[A,[B,A],[B]]
[A,[A,B,C],[B,C]]
[A,[B,A,C],[B,C]]
[A,[B,C,A],[B,C]]
[A,[A,B,C,D],[B,C,D]]
false.
29 ?- length(BB,NN), foo(AA,BB,CC), foo(AA2,BB,CC),
XX=[AA,AA2,BB,CC], numbervars(XX), writeln(XX), (NN>3, !, fail).
[A,A,[A],[]]
[A,A,[A,B],[B]]
[A,A,[A,A],[A]]
[A,A,[A,A],[A]]
[A,A,[B,A],[B]]
[A,A,[A,B,C],[B,C]]
[A,A,[A,A,B],[A,B]]
[A,A,[A,A,A],[A,A]]
[A,A,[A,A,B],[A,B]]
[A,A,[B,A,C],[B,C]]
[A,A,[B,A,A],[B,A]]
[A,A,[A,A,A],[A,A]]
[A,A,[B,A,A],[B,A]]
[A,A,[B,C,A],[B,C]]
[A,A,[A,B,C,D],[B,C,D]]
false.
AA and AA2 are always instantiated to the same variable.
There's nothing special about the number 3, so it is safe to conjecture by generalization that it will always be so, for any length tried.
Another attempt at Prolog-wise proof:
ground_list(LEN,L):-
findall(N, between(1,LEN,N), NS),
member(N,NS),
length(L,N),
maplist( \A^member(A,NS), L).
bcs(N, BCS):-
bagof(B-C, A^(ground_list(N,B),ground_list(N,C),foo(A,B,C)), BCS).
as(N, AS):-
bagof(A, B^C^(ground_list(N,B),ground_list(N,C),foo(A,B,C)), AS).
proof(N):-
as(N,AS), bcs(N,BCS),
length(AS,N1), length(BCS, N2), N1 =:= N2.
This compares the number of successful B-C combinations overall with the number of As they produce. Equality means one-to-one correspondence.
And so we have,
2 ?- proof(2).
true.
3 ?- proof(3).
true.
4 ?- proof(4).
true.
5 ?- proof(5).
true.
And so for any N it holds. Getting slower and slower. A general, unlimited query is trivial to write, but the slowdown seems exponential.

Using repeat to sort a collection of facts in prolog

I have a set of facts set/2 where the first variable is the identifier for the set and the second is the value associated with the identifier.
For example:
set(a,2).
set(a,c).
set(a,1).
set(a,a).
set(a,3).
set(a,b).
I need to construct a predicate ordering/2 (using the repeat operator) which will output the values of a specific set in their lexicographic order.
For example
?- ordering(a,Output).
Would result in
[1,2,3,a,b,c].
What I have made thus far is this code:
ordering(Input,Output):-
    findall(X,set(Input,X),List),
    repeat,
    doSort(List),
    sort(List, OrderedList),
    Output = OrderedList.
The idea here is that the predicate will find all values of the set associated with the specific Input and unify the List variable with them. Now we have the unsorted List. Here comes the part I'm not completely sure on. The predicate is supposed to keep using some sort of specific doSort on the List, then check the List with sort/2 and if it's lexicographically ordered, unify it with the Output.
Can anyone clarify if I'm on the correct path and if yes then how should the doSort be implemented?
Alright, I tried a sort of answer for this using @Daniel lyon's help:
ordering(Input,Output):-
findall(X,set(Input,X),List),
repeat,
permutation(List,PermutationList),
sort(PermutationList, SortedList),
Output= SortedList , !.
The general idea is the same, for the repeat cycle, the predicate will unify the List with PermutationList, try all variants of it and check for their correctness with sort/2 until it achieves the correct permutation, unifying it with SortedList, after that it will unify the Output with SortedList. The cut is there so I will only get the Output once.
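Note that sort/2 returns the same list for every permutation of its input, so the repeat and permutation/2 steps above do not change the result. A minimal sketch without them follows; it deliberately drops repeat, so it is a comparison point rather than a solution to the exercise as stated. The name ordering_sorted/2 is made up for illustration, and the facts are copied from the question.

```prolog
set(a,2).
set(a,c).
set(a,1).
set(a,a).
set(a,3).
set(a,b).

% ordering_sorted(+Id, -Sorted): the values of set Id in the standard
% order of terms (numbers before atoms), with duplicates removed.
% (ordering_sorted/2 is a hypothetical name, not from the question.)
ordering_sorted(Id, Sorted) :-
    findall(X, set(Id, X), List),
    sort(List, Sorted).
```

?- ordering_sorted(a, Output). gives Output = [1, 2, 3, a, b, c]. If duplicate values should be kept, msort/2 can be used in place of sort/2.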
?- % init test DB
| maplist([X]>>assert(set(a,X)), [c,b,a,1,2,3]).
true.
?- % find first
| set(a,X), \+ (set(a,Y), Y @< X).
X = 1 ;
false.
?- % find next, given - hardcoded here as 1 - previous
| set(a,X), X @> 1, \+ (set(a,Y), Y @> 1, Y @< X).
X = 2 ;
false.
Now we can try to make these queries reusable:
ordering(S,[H|T]) :- first(S,H), ordering(S,H,T).
first(S,X) :- set(S,X), \+ (set(S,Y), Y @< X).
next(S,P,X) :- set(S,X), X @> P, \+ (set(S,Y), Y @> P, Y @< X).
ordering(S,P,[X|T]) :- next(S,P,X), ordering(S,X,T).
ordering(_,_,[]).
To be correct, we need a cut somewhere. Can you spot the place?
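One possible placement of the cut is sketched below; it is a reading of the exercise, not necessarily the one the answer had in mind. The sketch assumes standard-order comparison (@< and @>), since the set mixes numbers and atoms, and it reuses the facts from the question. Without the cut, backtracking into ordering/3 would also return every proper prefix of the sorted list, because the catch-all clause ordering(_,_,[]) matches at each step.

```prolog
set(a,2).
set(a,c).
set(a,1).
set(a,a).
set(a,3).
set(a,b).

% first(+S, -X): X is the smallest value of set S in standard order.
first(S, X) :- set(S, X), \+ (set(S, Y), Y @< X).

% next(+S, +P, -X): X is the smallest value of set S greater than P.
next(S, P, X) :- set(S, X), X @> P, \+ (set(S, Y), Y @> P, Y @< X).

ordering(S, [H|T]) :- first(S, H), ordering(S, H, T).

% Assumed cut placement: once a successor exists, commit to it, so the
% empty-list clause is used only when there is no successor left.
ordering(S, P, [X|T]) :- next(S, P, X), !, ordering(S, X, T).
ordering(_, _, []).
```

With this version ?- ordering(a, L). has the single answer L = [1, 2, 3, a, b, c].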

Does prolog have a spread/splat/*args operator?

In many procedural languages (such as Python), I can "unpack" a list and use it as arguments for a function. For example...
def print_sum(a, b, c):
sum = a + b + c
print("The sum is %d" % sum)
print_sum(*[5, 2, 1])
This code will print: "The sum is 8"
Here is the documentation for this language feature.
Does Prolog have a similar feature?
Is there a way to replicate this argument-unpacking behaviour in Prolog?
For example, I'd like to unpack a list variable before passing it into call.
Could I write a predicate like this?
assert_true(Predicate, with_args([Input])) :-
call(Predicate, Input).
% Where `Input` is somehow unpacked before being passed into `call`.
...That I could then query with
?- assert_true(reverse, with_args([ [1, 2, 3], [3, 2, 1] ])).
% Should be true, currently fails.
?- assert_true(succ, with_args([ 2, 3 ])).
% Should be true, currently fails.
?- assert_true(succ, with_args([ 2, 4 ])).
% Should be false, currently fails.
Notes
You may think that this is an XY Problem. It could be, but don't get discouraged. It'd be ideal to receive an answer for just my question title.
You may tell me that I'm approaching the problem poorly. I know your intentions are good, but this kind of advice won't help to answer the question. Please refer to the above point.
Perhaps I'm approaching Prolog in too much of a procedural mindset. If this is the case, then what mindset would help me to solve the problem?
I'm using SWI-Prolog.
The built-in (=..)/2 (univ) serves this purpose. E.g.
?- G =.. [g, 1, 2, 3].
G = g(1,2,3).
?- g(1,2,3) =.. Xs.
Xs = [g,1,2,3].
However, note that many uses of (=..)/2 where the number of arguments is fixed can be replaced by call/2...call/8.
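Applied to the predicate from the question, (=..)/2 can build the goal from the predicate name and the argument list before calling it. This is a minimal sketch that assumes Predicate is an atom and Args is a proper list; it also assumes SWI-Prolog, where reverse/2 comes from library(lists).

```prolog
:- use_module(library(lists)).   % reverse/2, used in the example queries

% assert_true(+Predicate, +with_args(Args)): succeeds if calling
% Predicate with the elements of Args as its arguments succeeds.
% Sketch only: assumes Predicate is an atom and Args a proper list.
assert_true(Predicate, with_args(Args)) :-
    Goal =.. [Predicate|Args],   % e.g. succ(2, 3) from succ and [2, 3]
    call(Goal).
```

With this definition, the first two queries from the question succeed and the third one, with succ and [2, 4], fails.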
First: it is very easy, using unification and pattern matching, to get the elements of a list or the arguments of any term, if you know its shape. In other words:
sum_of_three(X, Y, Z, Sum) :- Sum is X+Y+Z.
?- List = [5, 2, 1],
List = [X, Y, Z], % or List = [X, Y, Z|_] if the list might be longer
sum_of_three(X, Y, Z, Sum).
For example, if you have command line arguments, and you are interested only in the first two command line arguments, it is easy to get them like this:
current_prolog_flag(argv, [First, Second|_])
Many standard predicates take lists as arguments, for example any predicate that needs a number of options, such as open/3 and open/4. Such a pair could be implemented as follows:
open(SrcDest, Mode, Stream) :-
open(SrcDest, Mode, Stream, []).
open(SrcDest, Mode, Stream, Options) :-
% get the relevant options and open the file
Here getting the relevant options can be done with a library like library(option), which can be used for example like this:
?- option(bar(X), [foo(1), bar(2), baz(3)]).
X = 2.
So this is how you can pass named arguments.
Another thing that was not mentioned in the answer by @false: in Prolog, you can do things like this:
Goal = reverse(X, Y), X = [a,b,c], Y = [c,b,a]
and at some later point:
call(Goal)
or even
Goal
To put it differently, I don't see the point in passing the arguments as a list, instead of just passing the goal as a term. At what point are the arguments a list, and why are they packed into a list?
In other words: given how call works, there is usually no need for unpacking a list [X, Y, Z] to a conjunction X, Y, Z that you can then use as an argument list. As in the comment to your question, these are all fine:
call(reverse, [a,b,c], [c,b,a])
and
call(reverse([a,b,c]), [c,b,a])
and
call(reverse([a,b,c], [c,b,a]))
The last one is the same as
Goal = reverse([a,b,c], [c,b,a]), Goal
This is why you can do something like this:
?- maplist(=(X), [Y, Z]).
X = Y, Y = Z.
instead of writing:
?- maplist(=, [X,X], [Y, Z]).
X = Y, Y = Z.

PROLOG defining 'delete' predicate

delete(X,[X|R],[_|R]).
delete(X,[F|R],[F|S]) :-
delete(X,R,S).
Above is my definition of delete predicate, for delete(X,L,R), intended to delete every occurrence of X in L with result R.
I ran the query below and got "G2397797". What does this string stand for?
?- delete(1,[1,2,3,4,5],X).
X = [_G2397797, 2, 3, 4, 5] .
If you simply correct your first clause and remove the unnecessary anonymous variable, you would get:
delete_each(X, [X|L], L).
delete_each(X, [Y|Ys], [Y|Zs]) :-
delete_each(X, Ys, Zs).
This will use unification, and delete each occurrence of X in the list upon backtracking:
?- delete_each(a, [a,b,a,c], R).
R = [b, a, c] ;
R = [a, b, c] ;
false.
Do you see how this is identical to select/3?
If you want to delete all occurrences of X in the list, you can see the answer by @coder.
In the answer you get X = [_G2397797, 2, 3, 4, 5] . , _G2397797 is not a string it is a variable that is not instantiated. This is due to the clause:
delete(X,[X|R],[_|R]).
which places in the output list an anonymous variable "_". You could write delete(X,[X|R],R).
But this has multiple problems. Firstly, it deletes only the first occurrence of X, not all of them, because the clause above succeeds as soon as it finds one. Also, you haven't handled the case of the empty list, which is the base case of the recursion. Finally, your second clause has no condition saying that F and X differ, so it gives wrong results when F equals X.
So you could write:
delete(_,[],[]).
delete(X,[X|R],S):-delete(X,R,S).
delete(X,[F|R],[F|S]):-dif(X,F),delete(X,R,S).
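A quick check of the three-clause definition, with the recursive call in the last clause taking all three arguments. The sketch uses the hypothetical name delete_all/3 so that it does not clash with the delete/3 that library(lists) already provides in SWI-Prolog, and it assumes a system with dif/2.

```prolog
% delete_all(?X, +List, -Rest): Rest is List with every occurrence
% of X removed. (delete_all/3 is a hypothetical name chosen to avoid
% clashing with the library's delete/3.)
delete_all(_, [], []).
delete_all(X, [X|R], S) :-
    delete_all(X, R, S).
delete_all(X, [F|R], [F|S]) :-
    dif(X, F),                 % F is kept only if it differs from X
    delete_all(X, R, S).
```

?- delete_all(a, [a,b,a,c], R). has the single answer R = [b, c].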

What does this wildcard do in this prolog scenario?

I've come across this code:
connectRow(_,_,0).
connectRow([spot(_,R,_,_)|Spots],R,K) :- K1 is K-1, connectRow(Spots,R,K1).
/*c*/
connectRows([]).
connectRows(Spots) :-
connectRow(Spots,_,9),
skip(Spots,9,Spots1),
connectRows(Spots1).
How does the wildcard in the connectRow(Spots,_,9) work? How does it know which values to check and how does it know that it checked all the possible values?
Edit: I think I understand why this works but I'd like it if someone could verify this for me:
When I "call" the connectRow with the wildcard it matches the wildcard with the "R" in the connectRow predicate. Could this be it?
The _ is just like any other variable, except that each one you see is treated as a different variable and Prolog won't show you what it unifies with. There's no special behavior there; if it confuses you about the behavior, just invent a completely new variable and put it in there to see what it does.
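The difference between two anonymous variables and one named variable used twice can be seen in a small sketch. The predicates any_pair/1 and same_pair/1 are hypothetical and exist only for this illustration.

```prolog
% Each _ is a fresh variable, so the two positions are independent.
any_pair(f(_, _)).

% A named variable must stand for the same term everywhere it occurs.
same_pair(f(A, A)).
```

Here ?- any_pair(f(1, 2)). succeeds, ?- same_pair(f(1, 2)). fails, and ?- same_pair(f(1, 1)). succeeds.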
Let's talk about how Prolog deals with variables. Here's an experiment you can follow along with that should undermine unhelpful preconceived notions if you happen to have them.
?- length([2,17,4], X).
X = 3.
A lot of Prolog looks like this and it's easy to fall into the trap of thinking that there are designated "out" variables that work like return values and designated "in" variables that work like parameters. After all:
?- length([2,17,4], 3).
true.
?- length([2,17,4], 5).
false.
Here we begin to see that something interesting is happening. A faulty intuition would be that Prolog is somehow keeping track of the input and output variables and "checking" in this case. That's not what's happening though, because unification is more general than that. Observe:
?- length(X, 3).
X = [_G2184, _G2187, _G2190].
We've now turned the traditional parameter/return value on its head: Prolog knows that X is a list three items long, but doesn't know what the items actually are. Believe it or not, this technique is frequently used to generate variables when you know how many you need but you don't need to have them individually named.
?- length(X, Y).
X = [],
Y = 0 ;
X = [_G2196],
Y = 1 ;
X = [_G2196, _G2199],
Y = 2 ;
X = [_G2196, _G2199, _G2202],
Y = 3
It happens that the definition of length is very general and Prolog can use it to generate lists along with their lengths. This kind of behavior is part of what makes Prolog so good at "generate and test" solutions. You define your problem logically and Prolog should be able to generate logically sound values to test.
All of this variation springs from a pretty simple definition of length:
length([], 0).
length([_|Rest], N1) :-
length(Rest, N0),
succ(N0, N1).
The key is to not read this like a procedure for calculating length but instead to see it as a logical relation between lists and numbers. The definition is inductive, relating the empty list to 0 and a list with some items to 1 + the length of the remainder of the list. The engine that makes this work is called unification.
In the first case, length([2,17,4], X), the value [17,4] is unified with Rest, N0 with 2 and N1 with 3. The process is recursive. In the final case, X is unified with [] and Y with 0, which leads naturally to the next case where we have some item and Y is 1, and the fact that the variable representing the item in the list doesn't have anything in particular to unify with doesn't matter because the value of that variable is never used.
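To experiment with this definition, it has to be loaded under another name, because length/2 is a built-in that cannot be redefined in SWI-Prolog. The sketch below uses the hypothetical name my_length/2 and is otherwise the definition shown above. It is not the actual implementation of the built-in, which handles more cases.

```prolog
% my_length(?List, ?N): N is the number of elements of List.
% (my_length/2 is a hypothetical name; same clauses as above.)
my_length([], 0).
my_length([_|Rest], N1) :-
    my_length(Rest, N0),
    succ(N0, N1).
```

?- my_length([2,17,4], X). gives X = 3, and ?- my_length(X, 3). first answers with a list of three fresh variables. Unlike the built-in, this simple version searches forever if asked for a second answer to the latter query, so wrapping that call in once/1 is advisable.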
Looking at your problem we see the same sort of recursive structure. The predicates are quite complex, so let's take them in pieces.
connectRow(_, _, 0).
This says connectRow(X, Y, 0) is true, regardless of X and Y. This is the base case.
connectRow([spot(_, R, _, _)|Spots], R, K) :-
This rule is matching a list of spots of a particular structure, presuming the first spot's second value (R) matches the second parameter.
K1 is K-1, connectRow(Spots, R, K1).
The body of this clause is essentially recurring on decrementing K, the third parameter.
It's clear now that this is basically going to generate a list that looks like [spot(_, R, _, _), spot(_, R, _, _), ... spot(_, R, _, _)] with length = K and no particular values in the other three positions for spot. And indeed that's what we see when we test it:
?- connectRow(X, Y, 0).
true ;
(infinite loop)^CAction (h for help) ? abort
% Execution Aborted
?- connectRow(X, Y, 2).
X = [spot(_G906, Y, _G908, _G909), spot(_G914, Y, _G916, _G917)|_G912] ;
(infinite loop)^CAction (h for help) ? abort
So there seem to be a few bugs here; if I were sure these were the whole story I would say:
The base case should use the empty list rather than matching anything
We should stipulate in the inductive case that K > 0
We should use clpfd if we want to be able to generate all possibilities
Making the changes we get slightly different behavior:
:- use_module(library(clpfd)).
connectRow([], _, 0).
connectRow([spot(_, R, _, _)|Spots], R, K) :-
K #> 0, K1 #= K-1, connectRow(Spots, R, K1).
?- connectRow(X, Y, 0).
X = [] ;
false.
?- connectRow(X, Y, 1).
X = [spot(_G906, Y, _G908, _G909)] ;
false.
?- connectRow(X, Y, Z).
X = [],
Z = 0 ;
X = [spot(_G918, Y, _G920, _G921)],
Z = 1 ;
X = [spot(_G918, Y, _G920, _G921), spot(_G1218, Y, _G1220, _G1221)],
Z = 2
You'll note that in the result we have Y standing in our spot structures, but we have weird looking automatically generated variables in the other positions, such as _G918. As it happens, we could use _ instead of Y and see a similar effect:
?- connectRow(X, _, Z).
X = [],
Z = 0 ;
X = [spot(_G1269, _G1184, _G1271, _G1272)],
Z = 1 ;
X = [spot(_G1269, _G1184, _G1271, _G1272), spot(_G1561, _G1184, _G1563, _G1564)],
Z = 2
All of these strange looking variables are there because we used _. Note that all of the spot structures have the exact same generated variable in the second position, because Prolog was told it had to unify the second parameter of connectRow with the second position of spot. It's true everywhere because R is "passed along" to the next call to connectRow, recursively.
Hopefully this helps explain what's going on with the _ in your example, and also Prolog unification in general.
Edit: Unifying something with R
To answer your question below, you can unify R with a value directly, or by binding it to a variable and using the variable. For instance, we can bind it directly:
?- connectRow(X, 'Hello, world!', 2).
X = [spot(_G275, 'Hello, world!', _G277, _G278), spot(_G289, 'Hello, world!', _G291, _G292)]
We can also pass an unbound variable and bind it afterwards:
?- connectRow(X, R, 2), R='Neato'.
X = [spot(_G21, 'Neato', _G23, _G24), spot(_G29, 'Neato', _G31, _G32)],
R = 'Neato'
There's nothing special about saying R=<foo>; it unifies both sides of the expression, but both sides can be expressions rather than variables:
?- V = [2,3], [X,Y,Z] = [1|V].
V = [2, 3],
X = 1,
Y = 2,
Z = 3.
So you can use R in another predicate just as well:
?- connectRow(X, R, 2), append([1,2], [3,4], R).
X = [spot(_G33, [1, 2, 3, 4], _G35, _G36), spot(_G41, [1, 2, 3, 4], _G43, _G44)],
R = [1, 2, 3, 4] ;
Note that this creates opportunities for backtracking and generating other solutions. For instance:
?- connectRow(X, R, 2), length(R, _).
X = [spot(_G22, [], _G24, _G25), spot(_G30, [], _G32, _G33)],
R = [] ;
X = [spot(_G22, [_G35], _G24, _G25), spot(_G30, [_G35], _G32, _G33)],
R = [_G35] ;
X = [spot(_G22, [_G35, _G38], _G24, _G25), spot(_G30, [_G35, _G38], _G32, _G33)],
R = [_G35, _G38] ;
Hope this helps!
