Cryptogram Puzzle with Prolog CLPFD - prolog

I recently found a small game on the Google Play app store called Cryptogram. There are dozens of apps similar to this one. The idea is to match the number to the colors such that all of the equations sound true.
I was able to get through problems 1-8 and problem 10 fairly quickly by hand, but problem 9 has proven to be more difficult for me.
Problem 9
After some time tinkering and guessing, I gave up and decided to program a solution. I have used Prolog/Datalog for some small tasks as an undergrad as well as some Project Euler problems. Previously I had seen the 15 line Sudoku solver that uses Prolog's Constraint Logic Programming over Finite Domains (clpfd) library, and I decided to give it a go myself. I'm using SWI-Prolog.
:- use_module(library(clpfd)).
problem(Colors) :-
Colors = [Pink, Cyan, Yellow, Green, Purple, Red, Brown, White, Lime],
Colors ins 0..9,
all_distinct(Colors),
% The leading digit of a number can't be 0
Pink #\= 0,
Red #\= 0,
White #\= 0,
Green #\= 0,
Lime #\= 0,
Cyan #\= 0,
% I originally tried to write a predicate generalizing numbers and a list of digits
% but got in way over my head with CLPFD.
Number1_1 #= (Pink * 1000) + (Cyan * 100) + (Pink * 10) + Yellow,
Number1_2 #= (Green * 10) + Purple,
Number1_3 #= (Cyan * 100) + (Red * 10) + Purple,
Number2_1 #= (Red * 1000) + (Brown * 100) + (White * 10) + Red,
Number2_2 #= (Lime * 10) + Yellow,
Number2_3 #= (Red * 1000) + (Lime * 100) + (Purple * 10) + Pink,
Number3_1 #= (White * 1000) + (Purple * 100) + (Cyan * 10) + White,
Number3_2 #= (Green * 1000) + (Cyan * 100) + (Yellow * 10) + Purple,
Number3_3 #= (Cyan * 1000) + (Red * 100) + (Yellow * 10) + Red,
% I'm not 100% sure whether to use floored or truncated division here.
% I thought the difference would be a float vs integer output,
% but that doesn't make sense with finite domains.
Number1_1 // Number1_2 #= Number1_3,
Number1_1 rem Number1_2 #= 0,
Number2_3 #= Number2_1 + Number2_2,
Number3_3 #= Number3_1 - Number3_2,
Number3_1 #= Number1_1 - Number2_1,
Number3_2 #= Number1_2 * Number2_2,
Number3_3 #= Number1_3 + Number2_3.
The output when I run this query in SWI-Prolog makes me feel like I'm misunderstanding a big concept in CLPFD:
?- problem([Pink, Cyan, Yellow, Green, Purple, Red, Brown, White, Lime]).
Pink in 3..9,
_7756#=Pink+10*Purple+1000*Red+100*Lime,
_7810#=1010*Pink+100*Cyan+Yellow,
all_distinct([Pink, Cyan, Yellow, Green, Purple, Red, Brown, White|...]),
Cyan in 1..7,
_7946#=1000*Cyan+10*Yellow+101*Red,
_7994#=100*Cyan+10*Yellow+1000*Green+Purple,
_8048#=10*Cyan+100*Purple+1001*White,
_8096#=100*Cyan+Purple+10*Red,
Yellow in 0..9,
_8162#=Yellow+10*Lime,
Green in 1..7,
_8216#=10*Green+Purple,
Purple in 0..9,
Red in 1..7,
_8294#=1001*Red+100*Brown+10*White,
Brown in 0..9,
White in 2..8,
Lime in 1..9,
_7756 in 1103..7568,
_8096+_7756#=_7946,
_8294+_8162#=_7756,
_8096 in 110..779,
_7810//_8216#=_8096,
_7810 in 3334..9799,
_8048+_8294#=_7810,
_7810 rem _8216#=0,
_8048 in 2313..8778,
_7946+_7994#=_8048,
_7946 in 1213..7678,
_7994 in 1100..7565,
_8216*_8162#=_7994,
_8216 in 12..79,
_8162 in 14..99,
_8294 in 1021..7486.
I would expect each color in the color list to bind to a single distinct integer in the range 0..9, but that's not what's happening. Can you help me find the solution to this problem?
EDIT
So I picked an arbitrary color and started assigning it numbers in the range that the constraint says should be valid. I ran this query with Cyan bound to 1.
?- problem([Pink, 1, Yellow, Green, Purple, Red, Brown, White, Lime]).
false.
Which doesn't make sense. The previous "output" says "Cyan in 1..7", which I thought meant that any value in that range is valid. However, if I pick another arbitrary value for Cyan:
?- problem([Pink, 2, Yellow, Green, Purple, Red, Brown, White, Lime]).
Pink = 7,
Yellow = 6,
Green = 3,
Purple = 4,
Red = 1,
Brown = 8,
White = 5,
Lime = 9.
I get the answer I was looking for. Though the Cryptogram is solved, I still don't understand why Prolog's CLPFD library didn't find it completely independently.
EDIT 2
I used your suggestions to clean up the code. I also reintroduced the predicate which relates digits to numbers. This code chunk works perfectly.
:- use_module(library(clpfd)).
digit_number(0, [], 1).
digit_number(Number, [Digit|Tail], DigitPlace) :-
digit_number(NextNumber, Tail, NextDigitPlace),
DigitPlace #= NextDigitPlace * 10,
PlaceNumber #= Digit * (NextDigitPlace),
Number #= PlaceNumber + NextNumber.
digit_number(Number, ColorList) :-
digit_number(Number, ColorList, _).
problem(Colors) :-
Colors = [Pink, Cyan, Yellow, Green, Purple, Red, Brown, White, Lime],
Colors ins 0..9,
all_distinct(Colors),
digit_number(Number1_1, [Pink, Cyan, Pink, Yellow]),
digit_number(Number1_2, [Green, Purple]),
digit_number(Number1_3, [Cyan, Red, Purple]),
digit_number(Number2_1, [Red, Brown, White, Red]),
digit_number(Number2_2, [Lime, Yellow]),
digit_number(Number2_3, [Red, Lime, Purple, Pink]),
digit_number(Number3_1, [White, Purple, Cyan, White]),
digit_number(Number3_2, [Green, Cyan, Yellow, Purple]),
digit_number(Number3_3, [Cyan, Red, Yellow, Red]),
Number1_1 // Number1_2 #= Number1_3,
Number1_1 rem Number1_2 #= 0,
Number2_1 + Number2_2 #= Number2_3,
Number3_1 - Number3_2 #= Number3_3,
Number1_1 - Number2_1 #= Number3_1,
Number1_2 * Number2_2 #= Number3_2,
Number1_3 + Number2_3 #= Number3_3,
label(Colors).

Your code works, just add label(C) :
?- problem(C), label(C).
C = [7, 2, 6, 3, 4, 1, 8, 5, 9] .

The other answer shows you one way of getting the result you want, but I would like to answer some of your questions.
I still don't understand why Prolog's CLPFD library didn't find it completely independently.
Prolog is a more-or-less declarative programming language, but (although we like to pretend, for propaganda reasons) you cannot just write down anything that is logically equivalent to your problem and expect it to be executed correctly and efficiently. In particular, the order of execution of different goals matters a lot, even though it should make no logical difference. This is especially true for arithmetic. Consider:
?- between(1, 99999999, N), N > 99999998.
N = 99999999. % correct but slooooow
?- N > 99999998, between(1, 99999999, N).
ERROR: >/2: Arguments are not sufficiently instantiated
Doing the same with CLP(FD) works much more nicely:
?- N in 1..99999999, N #> 99999998.
N = 99999999. % correct and fast!
?- N #> 99999998, N in 1..99999999.
N = 99999999. % also correct, also fast!
CLP(FD) allows you to write programs that are more correct, more declarative, and that can often be more efficient than other solutions, unless you hand-optimize them.
To achieve this, unlike normal Prolog, CLP(FD) separates the collection of constraints from the actual search for solutions. As your program goes along and creates constraints, CLP(FD) will make some simplifications, like in your example where it determines Cyan in 1..7 on its own, or in my example above where it can find the unique solution immediately. But in general, these simplifications do not solve the problem completely.
One reason for this is, simply, performance: Search can be slow. It can be faster if more constraints are known, because new constraints on already constrainted variables can only make the search space smaller, but never bigger! It makes sense to delay it until concrete answers are actually needed.
For this reason, to actually get concrete resuls, you need to call a labeling predicate that systematically enumerates solutions. In SWI-Prolog, simple ones are indomain/1 and label/1; a general one is labeling/2. This latter one even allows you to influence the search space exploration strategy, which can be useful if you have some understanding of the problem domain.
The previous "output" says "Cyan in 1..7", which I thought meant that any value in that range is valid.
Not quite: It means that if there is a valid solution for Cyan, then it is in the range 1 to 7. It doesn't give a guarantee that all values in that range are solutions. For example:
?- X in 1..5, Y in 1..5, X #< Y.
X in 1..4,
X#=<Y+ -1,
Y in 2..5.
3 is in the range 1..4, and 3 is in the range 2..5, so purely based on this we might expect a solution with X = 3 and Y = 3. But that is impossible due to the additional constraint. Only labeling will actually give you answers that are guaranteed solutions, and only if you label all the variables in the query.
See also the very nice answer here: https://stackoverflow.com/a/27218564/4391743
Edit:
% I'm not 100% sure whether to use floored or truncated division here.
% I thought the difference would be a float vs integer output,
% but that doesn't make sense with finite domains.
Number1_1 // Number1_2 #= Number1_3,
Indeed fractional division doesn't make sense here, but Prolog would have told you:
?- X in 1..5, Y in 1..5, Z #= X // Y.
X in 1..5,
X//Y#=Z,
Y in 1..5,
Z in 0..5.
?- X in 1..5, Y in 1..5, Z #= X / Y.
ERROR: Domain error: `clpfd_expression' expected, found `_G6388/_G6412'

Related

Optimized CLP(FD) solver for number board puzzle

Consider the problem from https://puzzling.stackexchange.com/questions/20238/explore-the-square-with-100-hops:
Given a grid of 10x10 squares, your task is to visit every square exactly once. In each step, you may
skip 2 squares horizontally or vertically or
skip 1 square diagonally
In other words (closer to my implementation below), label a 10x10 grid with numbers from 1 to 100 such that each square at coordinates (X, Y) is 1 or is equal to one more than the "previous" square at (X, Y-3), (X, Y+3), (X-3, Y), (X+3, Y), (X-2, Y-2), (X-2, Y+2), (X+2, Y-2), or (X+2, Y+2).
This looks like a straightforward constraint programming problem, and Z3 can solve it in 30 seconds from a simple declarative specification: https://twitter.com/johnregehr/status/1070674916603822081
My implementation in SWI-Prolog using CLP(FD) does not scale quite as nicely. In fact, it cannot even solve the 5x5 instance of the problem unless almost two rows are pre-specified:
?- number_puzzle_(_Square, Vars), Vars = [1,24,14,2,25, 16,21,5,8,20 |_], time(once(labeling([], Vars))).
% 10,063,059 inferences, 1.420 CPU in 1.420 seconds (100% CPU, 7087044 Lips)
_Square = square(row(1, 24, 14, 2, 25), row(16, 21, 5, 8, 20), row(13, 10, 18, 23, 11), row(4, 7, 15, 3, 6), row(17, 22, 12, 9, 19)),
Vars = [1, 24, 14, 2, 25, 16, 21, 5, 8|...].
?- number_puzzle_(_Square, Vars), Vars = [1,24,14,2,25, 16,21,5,8,_ |_], time(once(labeling([], Vars))).
% 170,179,147 inferences, 24.152 CPU in 24.153 seconds (100% CPU, 7046177 Lips)
_Square = square(row(1, 24, 14, 2, 25), row(16, 21, 5, 8, 20), row(13, 10, 18, 23, 11), row(4, 7, 15, 3, 6), row(17, 22, 12, 9, 19)),
Vars = [1, 24, 14, 2, 25, 16, 21, 5, 8|...].
?- number_puzzle_(_Square, Vars), Vars = [1,24,14,2,25, 16,21,5,_,_ |_], time(once(labeling([], Vars))).
% 385,799,962 inferences, 54.939 CPU in 54.940 seconds (100% CPU, 7022377 Lips)
_Square = square(row(1, 24, 14, 2, 25), row(16, 21, 5, 8, 20), row(13, 10, 18, 23, 11), row(4, 7, 15, 3, 6), row(17, 22, 12, 9, 19)),
Vars = [1, 24, 14, 2, 25, 16, 21, 5, 8|...].
(This is on an oldish machine with SWI-Prolog 6.0.0. On a newer machine with SWI-Prolog 7.2.3 it runs about twice as fast, but that's not enough to beat the apparent exponential complexity.)
The partial solution used here is from https://www.nurkiewicz.com/2018/09/brute-forcing-seemingly-simple-number.html
So, my question: How can I speed up the following CLP(FD) program?
Additional question for extra thanks: Is there a specific labeling parameter that speeds up this search significantly, and if so, how could I make an educated guess at which one it might be?
:- use_module(library(clpfd)).
% width of the square board
n(5).
% set up a term square(row(...), ..., row(...))
square(Square, N) :-
length(Rows, N),
maplist(row(N), Rows),
Square =.. [square | Rows].
row(N, Row) :-
functor(Row, row, N).
% Entry is the entry at 1-based coordinates (X, Y) on the board. Fails if X
% or Y is an invalid coordinate.
square_coords_entry(Square, (X, Y), Entry) :-
n(N),
0 < Y, Y =< N,
arg(Y, Square, Row),
0 < X, X =< N,
arg(X, Row, Entry).
% Constraint is a CLP(FD) constraint term relating variable Var and the
% previous variable at coordinates (X, Y). X and Y may be arithmetic
% expressions. If X or Y is an invalid coordinate, this predicate succeeds
% with a trivially false Constraint.
square_var_coords_constraint(Square, Var, (X, Y), Constraint) :-
XValue is X,
YValue is Y,
( square_coords_entry(Square, (XValue, YValue), PrevVar)
-> Constraint = (Var #= PrevVar + 1)
; Constraint = (0 #= 1) ).
% Compute and post constraints for variable Var at coordinates (X, Y) on the
% board. The computed constraint expresses that Var is 1, or it is one more
% than a variable located three steps in one of the cardinal directions or
% two steps along a diagonal.
constrain_entry(Var, Square, X, Y) :-
square_var_coords_constraint(Square, Var, (X - 3, Y), C1),
square_var_coords_constraint(Square, Var, (X + 3, Y), C2),
square_var_coords_constraint(Square, Var, (X, Y - 3), C3),
square_var_coords_constraint(Square, Var, (X, Y + 3), C4),
square_var_coords_constraint(Square, Var, (X - 2, Y - 2), C5),
square_var_coords_constraint(Square, Var, (X + 2, Y - 2), C6),
square_var_coords_constraint(Square, Var, (X - 2, Y + 2), C7),
square_var_coords_constraint(Square, Var, (X + 2, Y + 2), C8),
Var #= 1 #\/ C1 #\/ C2 #\/ C3 #\/ C4 #\/ C5 #\/ C6 #\/ C7 #\/ C8.
% Compute and post constraints for the entire board.
constrain_square(Square) :-
n(N),
findall(I, between(1, N, I), RowIndices),
maplist(constrain_row(Square), RowIndices).
constrain_row(Square, Y) :-
arg(Y, Square, Row),
Row =.. [row | Entries],
constrain_entries(Entries, Square, 1, Y).
constrain_entries([], _Square, _X, _Y).
constrain_entries([E|Es], Square, X, Y) :-
constrain_entry(E, Square, X, Y),
X1 is X + 1,
constrain_entries(Es, Square, X1, Y).
% The core relation: Square is a puzzle board, Vars a list of all the
% entries on the board in row-major order.
number_puzzle_(Square, Vars) :-
n(N),
square(Square, N),
constrain_square(Square),
term_variables(Square, Vars),
Limit is N * N,
Vars ins 1..Limit,
all_different(Vars).
First of all:
What is going on here?
To see what is happening, here are PostScript definitions that let us visualize the search:
/n 5 def
340 n div dup scale
-0.9 0.1 translate % leave room for line strokes
/Palatino-Roman 0.8 selectfont
/coords { n exch sub translate } bind def
/num { 3 1 roll gsave coords 0.5 0.2 translate
5 string cvs dup stringwidth pop -2 div 0 moveto show
grestore } bind def
/clr { gsave coords 1 setgray 0 0 1 1 4 copy rectfill
0 setgray 0.02 setlinewidth rectstroke grestore} bind def
1 1 n { 1 1 n { 1 index clr } for pop } for
These definitions give you two procedures:
clr to clear a square
num to show a number on a square.
For example, if you save these definitions to tour.ps and then invoke the PostScript interpreter Ghostscript with:
gs -r72 -g350x350 tour.ps
and then enter the following instructions:
1 2 3 num
1 2 clr
2 3 4 num
you get:
PostScript is a great programming language for visualizing search processes, and I also recommend to check out postscript for more information.
We can easily modify your program to emit suitable PostScript instructions that let us directly observe the search. I highlight the relevant additions:
constrain_entries([], _Square, _X, _Y).
constrain_entries([E|Es], Square, X, Y) :-
freeze(E, postscript(X, Y, E)),
constrain_entry(E, Square, X, Y),
X1 #= X + 1,
constrain_entries(Es, Square, X1, Y).
postscript(X, Y, N) :- format("~w ~w ~w num\n", [X,Y,N]).
postscript(X, Y, _) :- format("~w ~w clr\n", [X,Y]), false.
I have also taken the liberty to change (is)/2 to (#=)/2 to make the program more general.
Assuming that you saved the PostScript definitions in tour.ps and your Prolog program in tour.pl, the following invocation of SWI-Prolog and Ghostscript illustrates the situation:
swipl -g "number_puzzle_(_, Vs), label(Vs)" tour.pl | gs -g350x350 -r72 tour.ps -dNOPROMPT
For example, we see a lot of backtracking at the highlighted position:
However, essential problems already lie completely elsewhere:
None of the highlighted squares are valid moves!
From this, we see that your current formulation does not—at least not sufficiently early—let the solver recognize when a partial assignment cannot be completed to a solution! This is bad news, since failure to recognize inconsistent assignments often leads to unacceptable performance. For example, in order to correct the 1 &rightarrow; 3 transition (which can never occur in this way, yet is already one of the first choices made in this case), the solver would have to backtrack over approximately 8 squares, after enumerating—as a very rough estimate—258 = 152587890625 partial solutions, and then start all over at only the second position in the board.
In the constraint literature, such backtracking is called thrashing. It means repeated failure due to the same reason.
How is this possible? Your model seems to be correct, and can be used to detect solutions. That's good! However, a good constraint formulation not only recognizes solutions, but also quickly detects partial assignments that cannot be completed to solutions. This is what allows the solver to effectively prune the search, and it is in this important respect that your current formulation falls short. One of the reasons for this has to do with constraint propagation in reified constraints that you are using. In particular, consider the following query:
?- (X + 1 #= 3) #<==> B, X #\= 2.
Intuitively, we expect B = 0. But that is not the case! Instead, we get:
X in inf..1\/3..sup,
X+1#=_3840,
_3840#=3#B,
B in 0..1.
So, the solver does not propagate reified equality very strongly. Maybe it should though! Only sufficient feedback from Prolog practitioners will tell whether this area of the constraint solver should be changed, possibly trading a bit of speed for stronger propagation. The high relevance of this feedback is one of the reasons why I recommend to use CLP(FD) constraints whenever you have the opportunity, i.e., every time you are reasoning about integers.
For this particular case, I can tell you that making the solver stronger in this sense does not make that much of a difference. You essentially end up with a version of the board where the core issue still arises, with many transitions (some of them highlighted below) that cannot occur in any solution:
Fixing the core issue
We should eliminate the cause of the backtracking at its core. To prune the search, we must recognize inconsistent (partial) assignments earlier.
Intuitively, we are searching for a connected tour, and want to backtrack as soon as it is clear that the tour cannot be continued in the intended way.
To accomplish what we want, we have at least two options:
change the allocation strategy to take connectedness into account
model the problem in such a way that connectedness is more strongly taken into account.
Option 1: Allocation strategy
A major attraction of CLP(FD) constraints is that they let us decouple the task description from the search. When using CLP(FD) constraints, we often perform search via label/1 or labeling/2. However, we are free to assign values to variables in any way we want. This is very easy if we follow—as you have done—the good practice of putting the "constraint posting" part into its own predicate, called the core relation.
For example, here is a custom allocation strategy that makes sure that the tour remains connected at all times:
allocation(Vs) :-
length(Vs, N),
numlist(1, N, Ns),
maplist(member_(Vs), Ns).
member_(Es, E) :- member(E, Es).
With this strategy, we get a solution for the 5×5 instance from scratch:
?- number_puzzle_(Square, Vars), time(allocation(Vars)).
% 5,030,522 inferences, 0.907 CPU in 0.913 seconds (99% CPU, 5549133 Lips)
Square = square(row(1, 8, 5, 2, 11), ...),
Vars = [1, 8, 5, 2, 11, 16, 21, 24, 15|...]
There are various modifications of this strategy that are worth trying out. For example, when multiple squares are admissible, we could try to make a more intelligent choice by taking into account the number of remaining domain elements of the squares. I leave trying such improvements as a challenge.
From the standard labeling strategies, the min labeling option is in effect quite similar to this strategy in this case, and indeed it also finds a solution for the 5×5 case:
?- number_puzzle_(Square, Vars), time(labeling([min], Vars)).
% 22,461,798 inferences, 4.142 CPU in 4.174 seconds (99% CPU, 5422765 Lips)
Square = square(row(1, 8, 5, 2, 11), ...),
Vars = [1, 8, 5, 2, 11, 16, 21, 24, 15|...] .
However, even a fitting allocation strategy cannot fully compensate weak constraint propagation. For the 10×10 instance, the board looks like this after some search with the min option:
Note that we also have to adapt the value of n in the PostScript code to visualize this as intended.
Ideally, we should formulate the task in such a way that we benefit from strong propagation, and then also use a good allocation strategy.
Option 2: Remodeling
A good CLP formulation propagates as strongly as possible (in acceptable time). We should therefore strive to use constraints that allow the solver to reason more readily about the task's most important requirements. In this concrete case, it means that we should try to find a more suitable formulation for what is currently expressed as a disjunction of reified constraints that, as shown above, do not allow much propagation. In theory, the constraint solver could recognize such patterns automatically. However, that is impractical for many use cases, and we therefore must sometimes experiment by manually trying several promising formulations. Still, also in this case: With sufficient feedback from application programmers, such cases are more likely to be improved and worked on!
I now use the CLP(FD) constraint circuit/1 to make clear that we are looking for a Hamiltonian circuit in a particular graph. The graph is expressed as a list of integer variables, where each element denotes the position of its successor in the list.
For example, a list with 3 elements admits precisely 2 Hamiltonian circuits:
?- Vs = [_,_,_], circuit(Vs), label(Vs).
Vs = [2, 3, 1] ;
Vs = [3, 1, 2].
I use circuit/1 to describe solutions that are also closed tours. This means that, if we find such a solution, then we can start again from the beginning via a valid move from the last square in the found tour:
n_tour(N, Vs) :-
L #= N*N,
length(Vs, L),
successors(Vs, N, 1),
circuit(Vs).
successors([], _, _).
successors([V|Vs], N, K0) :-
findall(Num, n_k_next(N, K0, Num), [Next|Nexts]),
foldl(num_to_dom, Nexts, Next, Dom),
V in Dom,
K1 #= K0 + 1,
successors(Vs, N, K1).
num_to_dom(N, D0, D0\/N).
n_x_y_k(N, X, Y, K) :- [X,Y] ins 1..N, K #= N*(Y-1) + X.
n_k_next(N, K, Next) :-
n_x_y_k(N, X0, Y0, K),
( [DX,DY] ins -2 \/ 2
; [DX,DY] ins -3 \/ 0 \/ 3,
abs(DX) + abs(DY) #= 3
),
[X,Y] ins 1..N,
X #= X0 + DX,
Y #= Y0 + DY,
n_x_y_k(N, X, Y, Next),
label([DX,DY]).
Note how admissible successors are now expressed as domain elements, reducing the number of constraints and entirely eliminating the need for reifications. Most importantly, the intended connectedness is now automatically taken into account and enforced at every point during the search. The predicate n_x_y_k/4 relates (X,Y) coordinates to list indices. You can easily adapt this program to other tasks (e.g., knight's tour) by changing n_k_next/3. I leave the generalization to open tours as a challenge.
Here are additional definitions that let us print solutions in a more readable form:
:- set_prolog_flag(double_quotes, chars).
print_tour(Vs) :-
length(Vs, L),
L #= N*N, N #> 0,
length(Ts, N),
tour_enumeration(Vs, N, Es),
phrase(format_string(Ts, 0, 4), Fs),
maplist(format(Fs), Es).
format_(Fs, Args, Xs0, Xs) :- format(chars(Xs0,Xs), Fs, Args).
format_string([], _, _) --> "\n".
format_string([_|Rest], N0, I) -->
{ N #= N0 + I },
"~t~w~", call(format_("~w|", [N])),
format_string(Rest, N, I).
tour_enumeration(Vs, N, Es) :-
length(Es, N),
maplist(same_length(Es), Es),
append(Es, Ls),
foldl(vs_enumeration(Vs, Ls), Vs, 1-1, _).
vs_enumeration(Vs, Ls, _, V0-E0, V-E) :-
E #= E0 + 1,
nth1(V0, Ls, E0),
nth1(V0, Vs, V).
In formulations with strong propagation, the predefined ff search strategy is often a good strategy. And indeed it lets us solve the whole task, i.e., the original 10×10 instance, within a few seconds on a commodity machine:
?- n_tour(10, Vs),
time(labeling([ff], Vs)),
print_tour(Vs).
% 5,642,756 inferences, 0.988 CPU in 0.996 seconds (99% CPU, 5710827 Lips)
1 96 15 2 97 14 80 98 13 79
93 29 68 94 30 27 34 31 26 35
16 3 100 17 4 99 12 9 81 45
69 95 92 28 67 32 25 36 33 78
84 18 5 83 19 10 82 46 11 8
91 23 70 63 24 37 66 49 38 44
72 88 20 73 6 47 76 7 41 77
85 62 53 86 61 64 39 58 65 50
90 22 71 89 21 74 42 48 75 43
54 87 60 55 52 59 56 51 40 57
Vs = [4, 5, 21, 22, 8, 3, 29, 26, 6|...]
For utmost performance, I recommend you also try this with other Prolog systems. The efficiency of commercial-grade CLP(FD) systems is often an important reason for buying a Prolog system.
Note also that this is by no means the only promising Prolog or even CLP(FD) formulation of the task, and I leave thinking about other formulations as a challenge.

PROLOG: process of inference, why returns false

I have a definition of conc:
conc([], L2, L2).
conc([X1|R1], L2, [X1|RN]) :-
conc(R1, L2, RN).
I don't understand why conc([X | green], Y, [red, green, blue]). returns false rather than
X = [red],
Y = [blue]
What is the process of inference here?
Disclaimer: I don't know Prolog. The rest of this answer is an edumacated guess.
Your proposed solution of X = [red] doesn't make sense because that would make X a one-element list. Let's assume
X = red
instead.
That would give us
conc([red | green], [blue], [red, green, blue]).
With the second equation of conc that turns into
conc(R1, L2, RN).
% with:
% X1 = red
% R1 = green
% L2 = [blue]
% [X1|RN] = [red, green, blue]
% i.e. X1 = red
% RN = [green, blue]
I.e.
conc(green, [blue], [green, blue]).
And now we're stuck because none of your conc rules applies to green.
The problem is [X | green] because green is not the tail of a list.
Did you mean [X, green] instead?
In Prolog list notation, the | separates the elements enumerated at its left from a list of the remaining elements at its right. The issue is in your query. Instead of [X | green] you need to write either [X | [green]] or [X,green]. With one of these fixes, you get correct answer. E.g.
?- conc([X | [green]], Y, [red, green, blue]).
X = red,
Y = [blue].

How to check the repeatable variables in a list in prolog

I'm working on this Prolog question where I have to design a program that creates a grid of wizard hats of 4 different colours (blue, red, green, and yellow) where each hat has one of 4 different letters (w, x, y, and z). The hats have to be arranged in such a way that no row or column has two hats with the same colour or hats with the same letter in them.
I have to write a predicate validRow that is true if a row is valid, i.e., if no two hats have the same color or the same letter.
ex:
?- validRow([(1, 1, red, w), (1, 2, green, x), (1, 3, yellow, y), (1, 4, blue, z)]).
true.
This is my code so far
validRow([(R,C1,Colour1,Letter1), (R,C2,Colour2,Letter2),(R,C3,Colour3,Letter3), (R,C4,Colour4,Letter4)]) :-
isValid([Colour1,Colour2,Colour3,Colour4], [Letter1,Letter2,Letter3,Letter4]).
isValid([HC|TC],[HL|TL]) :-
not(member(HC,TC)),
not(member(HL,TL)),
isValid(TC,TL).
This doesn't give me the correct answer. How can i fix this?
Recursion requires a base case; you have not specified one for isValid.

Weight of marbles in a bag

So we just started learning prolog and I am having a hard time wrapping my head around prolog. We had this problem for homework and I just have no idea where to start.
Define a rule weight(R, O, Y, G, B) to find out how many marbles of each color can be contained in each bag, where R, O, Y, G, and B are the number of red, orange, yellow, green, blue marbles, respectively. The weight of the marbles is 1, 2, 3, 4, 5 grams respectively. Assuming the size of the bag is big enough to hold all the marbles with the total weight of 30 grams.
If you were going to solve this problem for just (say) yellow marbles, it would probably look like this:
weight(Marbles, Weight) :- Weight #= Marbles * 3.
Then if you wanted to see how many yellow marbles you have, you would query Prolog like so:
?- weight(Marbles, 30).
Marbles = 10.
Can you generalize from here? :)
Edit
Your problem can be solved by providing a domain.
weight(R,O,Y,G,B,Weight) :-
[R,O,Y,G,B] ins 0..sup,
Weight #= R + O*2 + Y*3 + G*4 + B*5.
When you call the predicate, you will get answers in the form of domains. This is usually not what I expect, so calling label/1 will generate solutions.
?- weight(R,O,Y,G,B,30), label([R,O,Y,G,B]).
R = O, O = Y, Y = G, G = 0,
B = 6 ;
R = O, O = Y, Y = 0,
G = 5,
B = 2 ;
etc.

Solving the Zebra puzzle (aka. Einstein puzzle) using the clpfd Prolog library

I have been given an exercise to solve the zebra puzzle using a constraint solver of my choice, and I tried it using the Prolog clpfd library.
I am aware that there are other more idiomatic ways to solve this problem in Prolog, but this question is specifically about the clpfd package!
So the specific variation of the puzzle (given that there are many of them) I'm trying to solve is this one:
There are five houses
The Englishman lives in the red house
The Swedish own a dog
The Danish likes to drink tea
The green house is left to the white house
The owner of the green house drinks coffee
The person that smokes Pall Mall owns a bird
Milk is drunk in the middle house
The owner of the yellow house smokes Dunhill
The norwegian lives in the first house
The marlboro smoker lives next to the cat owner
The horse owner lives next to the person who smokes dunhill
The winfield smoker likes to drink beer
The norwegian lives next to the blue house
The german smokes rothmanns
The marlboro smoker has a neighbor who drinks water
I tried to solve it with the following approach:
Each attribute a house can have is modeled as a variable, e.g. "British",
"Dog", "Green", etc. The attributes can take values from 1 to 5, depending on the house
in which they occur, e.g. if the variable "Dog" takes the value 3, the dog lives in the
third house.
This approach makes it easy to model neighbor constraints like this:
def neighbor(X, Y) :-
(X #= Y-1) #\/ (X #= Y+1).
But somehow, the clpfd package does not yield a solution, even though (IMO) the problem is modeled correctly (I used the exact same model with the Choco constraint solver and the result was correct).
Here is the complete code:
:- use_module(library(clpfd)).
neighbor(X, Y) :-
(X #= (Y - 1)) #\/ (X #= (Y + 1)).
solve([British, Swedish, Danish, Norwegian, German], Fish) :-
Nationalities = [British, Swedish, Danish, Norwegian, German],
Colors = [Red, Green, Blue, White, Yellow],
Beverages = [Tea, Coffee, Milk, Beer, Water],
Cigarettes = [PallMall, Marlboro, Dunhill, Winfield, Rothmanns],
Pets = [Dog, Bird, Cat, Horse, Fish],
all_different(Nationalities),
all_different(Colors),
all_different(Beverages),
all_different(Cigarettes),
all_different(Pets),
Nationalities ins 1..5,
Colors ins 1..5,
Beverages ins 1..5,
Cigarettes ins 1..5,
Pets ins 1..5,
British #= Red, % Hint 1
Swedish #= Dog, % Hint 2
Danish #= Tea, % Hint 3
Green #= White - 1 , % Hint 4
Green #= Coffee, % Hint 5,
PallMall #= Bird, % Hint 6
Milk #= 3, % Hint 7
Yellow #= Dunhill, % Hint 8,
Norwegian #= 1, % Hint 9
neighbor(Marlboro, Cat), % Hint 10
neighbor(Horse, Dunhill), % Hint 11
Winfield #= Beer, % Hint 12
neighbor(Norwegian, Blue), % Hint 13
German #= Rothmanns, % Hint 14,
neighbor(Marlboro, Water). % Hint 15
Did I misunderstand a concept within clpfd, or am I simply missing something obvious here? In case it helps, here you can find the same approach implemented using Choco and Scala.
Edit: The reason why I believe that the solver isn't able to solve the problem ist that it never comes up with definite values for the variables, but only with ranges, e.g. "Fish 1..3\/5".
There are several misconceptions here: You state "the clpfd package does not yield a solution", but actually it does yield one:
?- solve(Ls, Fish), label(Ls).
Ls = [3, 5, 2, 1, 4],
Fish in 1\/4,
all_different([5, 3, _G3699, 2, Fish]),
_G3699 in 1\/4,
_G3699+1#=_G3727,
_G3741+1#=_G3699,
_G3727 in 2\/4..5,
2#=_G3727#<==>_G3766,
_G3766 in 0..1,
_G3792#\/_G3766#<==>1,
_G3792 in 0..1,
2#=_G3741#<==>_G3792,
_G3741 in 0\/2..3.
So we know that if there is a solution, then Fish is either 1 or 4. Let's try 1:
?- solve(Ls, Fish), label(Ls), Fish = 1.
false.
No. So let's try 4:
?- solve(Ls, Fish), label(Ls), Fish = 4.
Ls = [3, 5, 2, 1, 4],
Fish = 4.
This works and is a ground solution to the problem. You can get it in a different way for example by including Fish in the variables that are to be labeled:
?- solve(Ls, Fish), label([Fish|Ls]).
Ls = [3, 5, 2, 1, 4],
Fish = 4 ;
false.
The purpose of labeling is exactly to try concrete values for constrained variables, independent of whether there actually is a solution. By coincidence, all_distinct/1 is strong enough to yield a ground solution by itself in this case, but in general this is of course not the case and you must eventually use labeling to obtain an unconditional (i.e., no more pending constraints) answer. Of course you must then in general also label all variables that are of interest to you, not just a subset of them as you did initially. To label a single variable, you can use indomain/1, so appending indomain(Fish) to the first query above would also work. I could not reproduce the instantiation error you mentioned in a further comment, in fact as you see above the most general query solve(X, Y) works with the code you posted. Finally, check this out:
neighbor(X, Y) :- abs(X-Y) #= 1.
running your code in SWI-Prolog, I get
?- solve(X),label(X).
X = [3, 5, 2, 1, 4].
Without label:
?- solve(X).
X = [3, _G3351, _G3354, 1, _G3360],
_G3351 in 4..5,
all_different([_G3351, _G3386, _G3389, 2, _G3395]),
all_different([3, _G3351, _G3354, 1, _G3360]),
_G3386 in 3..5,
all_different([_G3386, _G3444, 1, _G3450, _G3360]),
_G3389 in 1\/3..5,
_G3389+1#=_G3478,
_G3492+1#=_G3389,
_G3395 in 1\/3..5,
_G3478 in 2..6,
_G3444#=_G3478#<==>_G3529,
_G3444 in 2..5,
_G3444#=_G3556#<==>_G3553,
_G3444#=_G3568#<==>_G3565,
_G3444#=_G3492#<==>_G3577,
_G3450 in 2\/5,
all_different([_G3354, 4, 3, _G3450, _G3614]),
_G3360 in 2\/4..5,
_G3354 in 2\/5,
_G3614 in 1..2\/5,
_G3614+1#=_G3556,
_G3568+1#=_G3614,
_G3556 in 2..3\/6,
_G3553 in 0..1,
_G3565#\/_G3553#<==>1,
_G3565 in 0..1,
_G3568 in 0..1\/4,
_G3492 in 0..4,
_G3577 in 0..1,
_G3577#\/_G3529#<==>1,
_G3529 in 0..1.
If I change all_different to all_distinct I get the solution without label:
....
all_distinct(Nationalities),
all_distinct(Colors),
all_distinct(Beverages),
all_distinct(Cigarettes),
all_distinct(Pets),
....
?- solve(X).
X = [3, 5, 2, 1, 4].
As you see, the docs state stronger propagation for all_distinct vs all_different. Running the proposed sample help to understand the difference between those:
?- maplist(in, Vs, [1\/3..4, 1..2\/4, 1..2\/4, 1..3, 1..3, 1..6]), all_distinct(Vs).
false.
?- maplist(in, Vs, [1\/3..4, 1..2\/4, 1..2\/4, 1..3, 1..3, 1..6]), all_different(Vs).
Vs = [_G419, _G422, _G425, _G428, _G431, _G434],
_G419 in 1\/3..4,
all_different([_G419, _G422, _G425, _G428, _G431, _G434]),
_G422 in 1..2\/4,
_G425 in 1..2\/4,
_G428 in 1..3,
_G431 in 1..3,
_G434 in 1..6.

Resources