Solving "Feed the Golorp" puzzle in Prolog - prolog

Some time ago I created a problem for Codeforces April Fools Day Contest 2014 - "Feed the Golorp": http://codeforces.com/contest/409/problem/I.
Please read the problem statement on the link provided.
The problem was intended to be solved by people who don't know Prolog at all. Only 3 persons managed to solve the problem during the contest - in Java, Python and C++.
The main challenge is to understand what's need to be done. For a person with some Prolog experience it should be almost obvious
that golorp's name like ?(_-_/___*__):-___>__. defines a Prolog predicate, and the task is to find minimal values of variables such that the predicates satisfied.
There are some details: again, please read the problem statement.
Actually solving the problem after understanding the goal is not so trivial. Algorithmically the task is to topologically sort the variables according to constraints.
Golorp's name can be up to 1024 characters long, so decently efficient algorithm is needed.
I wrote my reference solution in Python with regular expressions. But after the contest I started to wonder how to solve the problem in Prolog.
Because of the possible length of the golorp's name up to 1024 characters using just Prolog backtracking to bruteforce all the possibilities doesn't look feasible -
constraint logic programming is probably needed.
If I could extract list of all variables and list of pairs of variables from the inequalities, I can solve it. For example in ECLiPSe CLP:
:- lib(ic).
solve(Vars, Ineqs, Result) :-
Vars :: 0..9,
( foreach((A, B), Ineqs) do
A #< B ),
labeling(Vars),
concat_string(Vars, Result).
[eclipse]: Vars = [__, ___, __, ___], Ineqs = [(__, ___)], solve(Vars, Ineqs, Result).
Vars = [0, 1, 0, 1]
__ = 0
___ = 1
Ineqs = [(0, 1)]
Result = "0101"
But I'm not sure how to get Vars = [__, ___, __, ___] and Ineqs = [(__, ___)] from ?(__+___+__-___):-___>__. without too much code.
term_variables/2 loses duplicate variables. DCG?
Or is there completely different, better way to solve the puzzle in Prolog? (not necessarily in ECLiPSe CLP).
Update: couple of large examples to test:
?(_____________________*_________________________*________________________*___________________*_________________*__________________*___________________________*___*__*____________________*_________________________*_______________*____*___________*_____________*______*_____*_______________*____________*__________________*___________________________*___________________________):-_____>__,_______________<___________________,__>___________,________________________>______,_____________>______,____________________<_________________________,_________________<__________________,_____________<___,____<_________________________,______>____________,________________________>_________________________,_____<____________________,__<____________,_____________________>____________,__________________>_______________,_____>___,___________<_______________,_________________________>____,____<___________________,________________________>___________________________,____________>___________________________,_____<_______________.
Result: 3898080517870043672800
?(___*__*_____*____*_____*___*___*_____*___*___*___*__*___*_____*___*_____*____*___*____*_____*_____*____*_____*____*____*____*___*___*__*___*____*__*_____*_____*____*____*___*__*____*___*___*____*_____*_____*____*___*__*_____*____*__*_____*___*___*___*_____*____*___*_____*_____*___*___*___*____*__*_____*_____*__*___*__*__*_____*____*_____*___*__*_____*_____*__*____*___*____*_____*_____*___*___*___*_____*__*__*__*__*___*_____*__*___*___*____*_____*___*__*_____*_____*_____*_____*_____*__*__*___*___*_____*____*___*__*___*__*___*_____*__*_____*_____*_____*____*____*___*___*_____*____*____*__*__*_____*___*__*___*_____*_____):-____>_____,___>____.
Result: 2001022022202020121001011122021000112012210012001002220120022210000200010200001210022200000200221020000000022012020200000112201100020200

last edit: Since brute force based answer was inappropriate, as advised, here is the library(clpfd) based solution:
:- [library(clpfd)].
feed_the_golorp_clp(G, Food) :-
G = (?(F) :- C),
prepare(C, P),
term_variables(G, T),
T ins 0..9,
call(P),
label(T),
with_output_to(string(Food), yields(F)).
yields(E) :- E =.. [_,A,B] -> yields(A), yields(B) ; write(E).
prepare(C, P) :-
compound(C),
C =.. [O, A, B],
member((O, Op), [(<, #<), (>, #>), ((,), (,))]),
prepare(A, Pa),
prepare(B, Pb),
P =.. [Op, Pa, Pb].
prepare(C, C).
that works well on largest example, yielding "3898080517870043672800"...
Resume previous answer...
pure Prolog:
feed_the_golorp(G, F) :-
G = (_ :- B),
term_variables(G, F),
maplist(food, F),
call(B).
food(X) :- member(X, [0,1,2,3,4,5,6,7,8,9]).
easy, given your extensive explanation, but not so efficient...
?- time(feed_the_golorp(( ?(______________________/____+_______*__-_____*______-___):-__<___,___<____,____<_____,_____<______,______<_______ ), F)).
% 976,115 inferences, 0.874 CPU in 0.876 seconds (100% CPU, 1116785 Lips)
______________________ = __, __ = 0,
____ = 2,
_______ = 5,
_____ = 3,
______ = 4,
___ = 1,
F = [0, 2, 5, 0, 3, 4, 1]
.
edit I'd like a counterexample based on variables ordering, since I feel my code could be incomplete/incorrect...
Indeed, I completely missed the concatenation part...
feed_the_golorp(G, Food) :-
G = (?(F) :- C),
term_variables(G, T),
maplist(food, T),
call(C),
yields(F, S),
atomic_list_concat(S, Food).
food(X) :- member(X, [0,1,2,3,4,5,6,7,8,9]).
yields(C, [C]) :- number(C).
yields(E, S) :- E =.. [_,A,B], yields(A,Ca), yields(B,Cb), append(Ca,Cb,S).
now the result is more plausible
?- time(feed_the_golorp(( ?(___*__*_____*____*_____*___*___*_____*___*___*___*__*___*_____*___*_____*____*___*____*_____*_____*____*_____*____*____*____*___*___*__*___*____*__*_____*_____*____*____*___*__*____*___*___*____*_____*_____*____*___*__*_____*____*__*_____*___*___*___*_____*____*___*_____*_____*___*___*___*____*__*_____*_____*__*___*__*__*_____*____*_____*___*__*_____*_____*__*____*___*____*_____*_____*___*___*___*_____*__*__*__*__*___*_____*__*___*___*____*_____*___*__*_____*_____*_____*_____*_____*__*__*___*___*_____*____*___*__*___*__*___*_____*__*_____*_____*_____*____*____*___*___*_____*____*____*__*__*_____*___*__*___*_____*_____):-____>_____,___>____), F)).
% 17,806 inferences, 0.009 CPU in 0.010 seconds (94% CPU, 1968536 Lips)
___ = 2,
__ = _____, _____ = 0,
____ = 1,
F = '2001022022202020121001011122021000112012210012001002220120022210000200010200001210022200000200221020000000022012020200000112201100020200'
.
or, somewhat more compact and yielding output similar to example:
feed_the_golorp(G, Food) :-
G = (?(F) :- C),
term_variables(G, T),
maplist(food, T),
call(C),
with_output_to(string(Food), yields(F)).
food(X) :- member(X, [0,1,2,3,4,5,6,7,8,9]).
yields(E) :- E =.. [_,A,B] -> yields(A), yields(B) ; write(E).
but, with_output_to/2 it's a SWI-Prolog only utility...

This is an ECLiPSe solution that takes the Golorp description directly:
:- lib(ic).
golorp((?(Jaws):-Stomach), Food) :-
term_vars(Jaws, Xs, []),
Xs :: 0..9,
call(Stomach)#ic,
once labeling(Xs),
concat_string(Xs, Food).
term_vars(X, [X|Vs], Vs) :- var(X).
term_vars(Xs, Vs, Vs0) :- nonvar(Xs),
( foreacharg(X,Xs), fromto(Vs,Vs2,Vs1,Vs0) do
term_vars(X, Vs2, Vs1)
).
I've used a duplicate-preserving variant of term_variables/2, and exploited the fact that the ic constraint solver library defines constraint-versions of all the comparison predicates >/2, </2 etc. Sample run:
?- golorp((?(_-_/___*__):-___>__), Food).
___ = 1
__ = 0
Food = "0010"
Yes (0.00s cpu)

Related

What is the difference in execution if the cut '!' is present?

counter([],[]).
counter([H|T],[[H,C1]|R]) :- counter(T,[[H,C]|R]),!, C1 is C+1.
counter([H|T],[[H,1]|R]) :- counter(T,R).
What is the effect of the "!" as I'm getting the same output for an input in both the above and below code?
counter([],[]).
counter([H|T],[[H,C1]|R]) :- counter(T,[[H,C]|R]),C1 is C+1.
counter([H|T],[[H,1]|R]) :- counter(T,R).
I'm new to Prolog.
What is the effect of the "!"
The cut prunes the search space. That is, in an otherwise pure and monotonic program, the cut will remove some solutions or answers. As long as those are redundant that's fine. It sounds so innocent and useful, doesn't it? Let's have a look!
And lest I forget, using [E,Nr] to denote pairs is rather unusual, better use a pair E-Nr.
We will now compare counter_cut/2 and counter_sans/2.
| ?- counter_cut([a,a],Xs).
Xs = [[a,2]].
| ?- counter_sans([a,a],Xs).
Xs = [[a, 2]]
; Xs = [[a, 1], [a, 1]]. % <<< surprise !!!
So the cut-version has fewer solutions. Seems the solution counter_cut/2 retained is the right one. In this very particular case. Will it always take the right one? I will try a minimally more general query:
| ?- counter_cut([a,B],Xs).
B = a,
Xs = [[a, 2]].
| ?- counter_sans([a,B],Xs).
B = a,
Xs = [[a, 2]]
; Xs = [[a, 1], [B, 1]].
Again, _sans is chattier, and this time, it is even a bit right-er; for the last answer includes B = b. In other words,
| ?- counter_cut([a,B], Xs), B = b.
fails. % incomplete !
| ?- counter_sans([a,B], Xs), B = b.
B = b,
Xs = [[a,1],[b,1]].
So sometimes the _cut version is better, and sometimes _sans. Or to put more directly: Both are wrong somehow, but the _sans-version at least includes all solutions.
Here is a "purified" version, that simply rewrites the last rule into two different cases: One for the end of the list and the other for a further, different element.
counter_pure([],[]).
counter_pure([H|T],[[H,C1]|R]) :- counter_pure(T,[[H,C]|R]), C1 is C+1.
counter_pure([H],[[H,1]]).
counter_pure([H,D|T],[[H,1]|R]) :- dif(H,D), counter_pure([D|T],R).
From an efficiency viewpoint that is not too famous.
Here is a test case for efficiency for a system with rational tree unification:
?- Es = [e|Es], counter(Es, Dict).
resource_error(stack).
Instead, the implementation should loop smoothly, at least till the end of this universe. Strictly speaking, that query has to produce a resource error, but only after it has counted up to a number much larger than 10^100000000.
Here's my pure and hopefully efficient solution:
counter([X|L], C):- counter(L, X, 1, C).
counter([],X, Cnt, [[X,Cnt]]).
counter([Y|L], X, Cnt, [[X,Cnt]|C]):-
dif(X, Y),
counter(L, Y, 1, C).
counter([X|L],X, Cnt, [[X,XCnt]|C]):-
Cnt1 #= Cnt+1,
Cnt1 #=< XCnt,
counter(L, X, Cnt1, [[X,XCnt]|C]).
Using if_3 as suggested by #false:
counter([X|L], C):- counter(L, X, 1, C).
counter([],X, Cnt, [[X,Cnt]]).
counter([Y|L], X, Cnt, [[X,XCnt]|C]):-
if_(X=Y,
(
Cnt1 #= Cnt+1,
Cnt1 #=< XCnt,
counter(L, X, Cnt1, [[X,XCnt]|C])
),
(
XCnt=Cnt,
counter(L, Y, 1, C)
)
).
The cut operator ! commits to the current derivation path by pruning all choice points. Given some facts
fact(a).
fact(b).
you can compare the answers with and without cut:
?- fact(X).
X = a ;
X = b.
?- fact(X), !.
X = a.
As you can see, the general query now only reports its first success. Still, the query
?- fact(b), !.
true.
succeeds. This means, that cut violates the interpretation of , as logical conjunction:
?- X = b, fact(X), !.
X = b.
?- fact(X), !, X=b.
false.
but from our understanding of conjunction, A ∧ B should hold exactly when B ∧ A holds. So why do this at all?
Efficiency: cuts can be used such that they only change execution properties but not the answers of a predicate. These so called green cuts are for instance described in Richard O'Keefe's Craft of Prolog. As demonstrated above, maintaining correctness of a predicate with cut is much harder than one without, but obviously, correctness should come before efficiency.
It looks as if your problem was green, but I am not 100% sure if there is not a change in the answers.
Negation: logical negation according to the closed world assumption is expressed with cut. You can define neg(X) as:
neg(X) :-
call(X),
!,
false.
neg(_) :-
true.
So if call(X) succeeds, we cut the choice point for the second rule away and derive false. Otherwise, nothing is cut and we derive true. Please be aware that this is not negation in classical logic and that it suffers from the non-logical effects of cut. Suppose you define the predicate land/1 to be one of the continents:
land(africa).
land(america).
land(antarctica).
land(asia).
land(australia).
land(europe).
and then define water as everything not on land:
water(X) :-
neg(land(X)).
then you can correctly obtain:
?- water(pacific).
true.
?- water(africa).
false.
But you can also derive:
?- water(space).
true.
which should not hold. In particular, in classical logic:
land(africa) ∧
land(america) ∧
land(antarctica) ∧
land(asia) ∧
land(australia) ∧
land(europe) → ¬ land(space).
is not valid. Again, you should know well what you are doing if you use negation in Prolog.
Here is my attempt using if_/3:
counter([], []).
counter([H|T], [[H,C]|OutT] ):-
if_(
T=[],
(C = 1,OutT=[]),
(
[H|T] = [H,H1|T2],
if_(
H=H1,
(counter([H1|T2], [[H1,C1]|OutT]), C is C1+1),
(C = 1, counter([H1|T2], OutT))
)
)
).

Prolog unpacking lists predicate

I tried to create something what would work like this:
?- unpacking([[1], [1,2], [3]], Lst1, NewLst).
NewLst=[1,3]
I wrote it like this:
unpacking([], Lst1, Lst1).
unpacking([[H]|T], Lst1, NewLst):-
append([H], Lst2),
unpacking(T, Lst2, NewLst).
unpacking([_|T], Lst1, NewLst):-
unpacking(T, Lst1, NewLst).
and I know that I am doing something wrong. I am starting in Prolog so, need to learn from my mistakes :)
You probably meant:
unpacking([], []).
unpacking([[E]|T], [E|L]) :-
unpacking(T, L).
unpacking([[]|T], L) :-
unpacking(T, L).
unpacking([[_,_|_]|T], L) :-
unpacking(T, L).
There are more concise ways to write this - and more efficient, too.
What about this :
%?-unpacking([[a,b,c],[a],[b],[c,d]],Items).
unpacking(Lists,Items):-
my_tpartition(length_t(1),Lists,Items,Falses).
my_tpartition(P_2,List,Ts,Fs) :- my_tpartition_ts_fs_(List,Ts,Fs,P_2).
my_tpartition_ts_fs_([],[],[],_).
my_tpartition_ts_fs_([X|Xs0],Ts,Fs,P_2) :-
if_(call(P_2,X), (X=[NX],Ts = [NX|Ts0], Fs = Fs0),
(Ts = Ts0, Fs = [X|Fs0])),
my_tpartition_ts_fs_(Xs0,Ts0,Fs0,P_2).
length_t(X,Y,T):-
length(Y,L1),
=(X,L1,T).
This is based on Most general higher-order constraint describing a sequence of integers ordered with respect to a relation
* Update*
You could change to
length_t(X,Y,T):-
L1 #=< X,
fd_length(Y,L1),
=(X,L1,T),!.
length_t(_X,_Y,false).
fd_length(L, N) :-
N #>= 0,
fd_length(L, N, 0).
fd_length([], N, N0) :-
N #= N0.
fd_length([_|L], N, N0) :-
N1 is N0+1,
N #>= N1,
fd_length(L, N, N1).
giving:
?-unpacking([[1],[2,3],[4],[_,_|_]],U).
U= [1,4].
but:
?-unpacking([X],Xs).
X = Xs, Xs = [].
Based on #coder's solution, I made my own attempt using if_ and DCGs:
one_element_([], true).
one_element_([_|_],false).
one_element([], false).
one_element([_|Xs], T) :-
one_element_(Xs, T).
f([]) -->
[].
f([X|Xs]) -->
{ if_(one_element(X), Y=X, Y=[]) },
Y,
f(Xs).
unpack(Xs,Ys) :-
phrase(f(Xs),Ys).
I only tried for about 30s, but the queries:
?- Xs = [[] | Xs], unpack(Xs,Ys).
?- Xs = [[_] | Xs], unpack(Xs,Ys).
?- Xs = [[_, _ | _] | Xs], unpack(Xs,Ys).
didn't stop with a stack overflow. In my opinion, the critical one should be the last query, but apparently, SWI Prolog manages to optimize:
?- L = [_,_|_], one_element(L,T).
L = [_3162, _3168|_3170],
T = false.
Edit: I improved the solution and gave it a shot with argument indexing. According to the SWI Manual, indexing happens if there is exactly a case distinction between the empty list [] and the non-empty list [_|_]. I rewrote one_element such that it does exactly that and repeated the trick with the auxiliary predicate one_element_. Now that one_element is pure again, we don't lose solutions anymore:
?- unpack([A,B],[]).
A = [_5574, _5580|_5582],
B = [_5628, _5634|_5636] ;
A = [_5574, _5580|_5582],
B = [] ;
A = [],
B = [_5616, _5622|_5624] ;
A = B, B = [].
but
?- unpack([[a,b,c],[a],[b],[c,d]],Items).
Items = [a, b].
is still deterministic. I have not tried this solution in other Prologs, which might be missing the indexing, but it seems for SWI, this is a solution.
Update: Apparently GNU Prolog does not do this kind of indexing and overflows on cyclic lists:
| ?- Xs = [[] | Xs], unpack(Xs,Ys).
Fatal Error: global stack overflow (size: 32770 Kb, reached: 32768 Kb, environment variable used: GLOBALSZ)
After some thought, here is my implementation using if_/3:
unpacking(L,L1):-if_( =(L,[]), L1=[], unpack(L,L1)).
unpack([H|T],L):-if_(one_element(H), (H = [X],L=[X|T1],unpacking(T,T1)), unpacking(T,L)).
one_element(X, T) :-
( var(X) ->(T=true,X=[_]; T=false,X=[])
; X = [_] -> T = true
; X \= [_] -> T = false).
Some testcases:
?- unpacking([Xss],[]).
Xss = [].
?- unpacking([[1],[2,3],[4],[_,_|_]],U).
U = [1, 4].
?- unpacking([[1],[2,3],[4]],U).
U = [1, 4].
?- unpacking([[E]],[1]), E = 2.
false.
?- unpacking(non_list, []).
false.
?- unpacking([Xs],Xs).
Xs = [_G6221] ;
Xs = [].
UPDATE
To fix the case that #false referred in the comment we could define:
one_element([],false).
one_element([_],true).
one_element([_,_|_],false).
But this leaves some choice points...
One way to do it is with a findall I dont think its what the bounty is for though ;)
unpacking(Lists,L1):-
findall(I,(member(M,Lists),length(M,1),M=[I]),L1).
or
unpacking2(Lists,L1):-
findall(I,member([I],Lists),L1).

how can simulate nested loop in prolog?

how can i simulate this code in Prolog?
// L = an existing list ;
// function foo(var X, var Y)
result = new List();
for(int i=0;i<L.length;i++)
for(int j=0;j<L.length;j++){
result.add(foo(L.get(i), L.get(j));
}
nested loops are basically joins between sequences, and most of lists processing in Prolog is best expressed without indexing:
?- L=[a,b,c], findall(foo(X,Y), (member(X,L),member(Y,L)), R).
L = [a, b, c],
R = [foo(a, a), foo(a, b), foo(a, c), foo(b, a), foo(b, b), foo(b, c), foo(c, a), foo(c, b), foo(..., ...)].
edit
Sometime integers allow to capture the meaning in a simple way. As an example, my solution for one of the easier of Prolog context quizzes.
icecream(N) :-
loop(N, top(N)),
left, loop(N+1, center), nl,
loop(N+1, bottom(N)).
:- meta_predicate loop(+, 1).
loop(XH, PR) :-
H is XH,
forall(between(1, H, I), call(PR, I)).
top(N, I) :-
left, spc(N-I+1), pop,
( I > 1
-> pop,
spc(2*(I-2)),
pcl
; true
),
pcl, nl.
bottom(N, I) :-
left, spc(I-1), put(\), spc(2*(N-I+1)), put(/), nl.
center(_) :- put(/), put(\).
left :- spc(4).
pop :- put(0'().
pcl :- put(0')).
spc(Ex) :- V is Ex, forall(between(1, V, _), put(0' )).
Running in SWI-Prolog:
?- icecream(3).
()
(())
(( ))
/\/\/\/\
\ /
\ /
\ /
\/
true.
?- forall(loop(3,[X]>>loop(2,{X}/[Y]>>writeln(X-Y))),true).
1-1
1-2
2-1
2-2
3-1
3-2
true.
You can define a forto/4 meta-predicate easily. An example, taken from the Logtalk library loop object:
:- meta_predicate(forto(*, *, *, 0)).
forto(Count, FirstExp, LastExp, Goal) :-
First is FirstExp,
Last is LastExp,
forto_aux(Count, First, Last, 1, Goal).
:- meta_predicate(forto_aux(*, *, *, *, 0)).
forto_aux(Count, First, Last, Increment, Goal) :-
( First =< Last ->
\+ \+ (Count = First, call(Goal)),
Next is First + Increment,
forto_aux(Count, Next, Last, Increment, Goal)
; true
).
Example goal:
?- loop::forto(I, 1, 2, loop::forto(J, 1, 3, (write(I-J), nl))).
1-1
1-2
1-3
2-1
2-2
2-3
true.
Some Prolog compilers also provide built-in or library support for "logical loops" with good expressive power. Examples are (in alphabetic order) B-Prolog, ECLiPSe, and SICStus Prolog. Check the documentation of those systems for details. If you need a portable solution across most Prolog systems, check Logtalk's library documentation. Or simply take the above examples and define your own loop meta-predicates.
you can use this predicate using SICStus-prolog for looping variables I,J until N and get all of them inside fact foo/2 mentioned below successively ;
Code
loop(N) :- for(I,0,N),param(N) do
for(J,0,N),param(I) do
write(foo(I,J)),nl.
Result
| ?- loop(2).
foo(0,0)
foo(0,1)
foo(0,2)
foo(1,0)
foo(1,1)
foo(1,2)
foo(2,0)
foo(2,1)
foo(2,2)
yes

Solution to Smullyan's numerical machines

Here I propose to find a solution to Smullyan's numerical machines as defined here.
Problem statement
They're machines that take a list of digits as input, and transform it to another list of digits following some rules based on the pattern of the input.
Here are the rules of the machine given in the link above, expressed a bit more formally.
Let say M is the machine, and M(X) is the transformation of X.
We define a few rules like this:
M(2X) = X
M(3X) = M(X)2M(X)
M(4X) = reverse(M(X)) // reverse the order of the list.
M(5X) = M(X)M(X)
And anything that does not match any rule is rejected.
Here are a few examples:
M(245) = 45
M(3245) = M(245)2M(245) = 45245
M(43245) = reverse(M(3245)) = reverse(45245) = 54254
M(543245) = M(43245)M(43245) = 5425454254
And the questions are, find X such that:
M(X) = 2
M(X) = X
M(X) = X2X
M(X) = reverse(X)
M(X) = reverse(X2X)reverse(X2X)
Here is a second example a bit more complex with the exhaustive search (especially if I want the first 10 or 100 solutions).
M(1X2) = X
M(3X) = M(X)M(X)
M(4X) = reverse(M(X))
M(5X) = truncate(M(X)) // remove the first element of the list truncate(1234) = 234. Only valid if M(X) has at least 2 elements.
M(6X) = 1M(X)
M(7X) = 2M(X)
Questions:
M(X) = XX
M(X) = X
M(X) = reverse(X)
(Non-)Solutions
Writing a solver in Prolog is pretty straightforward. Except that it's just exhaustive exploration (a.k.a brute force) and may take some time for some set of rules.
I tried but couldn't express this problem in terms of logic constraints with CLP(FD), so I tried CHR (Constraint Handling Rules) to express this in terms of constraints on lists (especially append constraints), but no matter how I express it, it always boils down to an exhaustive search.
Question
Any idea what approach I could take to resolve any problem of this kind in a reasonable amount of time?
Ideally I would like to be able to generate all the solutions shorter than some bound.
Let's look at your "a bit more complex" problem. Exhaustive search works excellently!
Here is a comparison with Серге́й's solution which can be improved significantly by factoring the common goals:
m([1|A], X) :-
A = [_|_],
append(X, [2], A).
m([E | X], Z) :-
m(X, Y),
( E = 3,
append(Y, Y, Z)
; E = 4,
reverse(Y, Z)
; E = 5,
Y = [_ | Z]
; E = 6,
Z = [1 | Y]
; E = 7,
Z = [2 | Y]
).
For query time(findall(_, (question3(X), write(X), nl), _)). I get with B 8.1, SICStus 4.3b8:
Серге́й B tabled 104.542s
Серге́й B 678.394s
false B 16.013s
false B tabled 53.007s
Серге́й SICStus 439.210s
false SICStus 7.990s
Серге́й SWI 1383.678s, 5,363,110,835 inferences
false SWI 44.743s, 185,136,302 inferences
The additional questions are not that difficult to answer. Only SICStus with above m/2 and
call_nth/2:
| ?- time(call_nth( (
length(Xs0,N),append(Xs0,Xs0,Ys),m(Xs0,Ys),
writeq(Ys),nl ), 10)).
[4,3,7,4,3,1,4,3,7,4,3,1,2,4,3,7,4,3,1,4,3,7,4,3,1,2]
[3,4,7,4,3,1,3,4,7,4,3,1,2,3,4,7,4,3,1,3,4,7,4,3,1,2]
[4,3,7,3,4,1,4,3,7,3,4,1,2,4,3,7,3,4,1,4,3,7,3,4,1,2]
[3,4,7,3,4,1,3,4,7,3,4,1,2,3,4,7,3,4,1,3,4,7,3,4,1,2]
[3,5,4,5,3,1,2,2,1,3,5,4,5,3,1,2,3,5,4,5,3,1,2,2,1,3,5,4,5,3,1,2]
[3,5,5,3,4,1,2,1,4,3,5,5,3,4,1,2,3,5,5,3,4,1,2,1,4,3,5,5,3,4,1,2]
[5,4,7,4,3,3,1,2,5,4,7,4,3,3,1,2,5,4,7,4,3,3,1,2,5,4,7,4,3,3,1,2]
[4,7,4,5,3,3,1,2,4,7,4,5,3,3,1,2,4,7,4,5,3,3,1,2,4,7,4,5,3,3,1,2]
[5,4,7,3,4,3,1,2,5,4,7,3,4,3,1,2,5,4,7,3,4,3,1,2,5,4,7,3,4,3,1,2]
[3,5,4,7,4,3,1,_2735,3,5,4,7,4,3,1,2,3,5,4,7,4,3,1,_2735,3,5,4,7,4,3,1,2]
196660ms
| ?- time(call_nth( (
length(Xs0,N),m(Xs0,Xs0),
writeq(Xs0),nl ), 10)).
[4,7,4,3,1,4,7,4,3,1,2]
[4,7,3,4,1,4,7,3,4,1,2]
[5,4,7,4,3,1,_2371,5,4,7,4,3,1,2]
[4,7,4,5,3,1,_2371,4,7,4,5,3,1,2]
[5,4,7,3,4,1,_2371,5,4,7,3,4,1,2]
[3,5,4,7,4,1,2,3,5,4,7,4,1,2]
[4,3,7,4,5,1,2,4,3,7,4,5,1,2]
[3,4,7,4,5,1,2,3,4,7,4,5,1,2]
[4,7,5,3,6,4,1,4,7,5,3,6,4,2]
[5,4,7,4,3,6,1,5,4,7,4,3,6,2]
6550ms
| ?- time(call_nth( (
length(Xs0,N),reverse(Xs0,Ys),m(Xs0,Ys),
writeq(Ys),nl ), 10)).
[2,1,3,4,7,1,3,4,7]
[2,1,4,3,7,1,4,3,7]
[2,1,3,5,4,7,_2633,1,3,5,4,7]
[2,1,5,4,7,3,2,1,5,4,7,3]
[2,4,6,3,5,7,1,4,6,3,5,7]
[2,6,3,5,4,7,1,6,3,5,4,7]
[2,_2633,1,5,3,4,7,_2633,1,5,3,4,7]
[2,_2633,1,5,4,3,7,_2633,1,5,4,3,7]
[2,1,3,4,4,4,7,1,3,4,4,4,7]
[2,1,3,4,5,6,7,1,3,4,5,6,7]
1500ms
Here is another improvement to #Celelibi's improved version (cele_n). Roughly, it gets a factor of two by constraining the length of the first argument, and another factor of two by pretesting the two versions.
cele_n SICStus 2.630s
cele_n SWI 12.258s 39,546,768 inferences
cele_2 SICStus 0.490s
cele_2 SWI 2.665s 9,074,970 inferences
appendh([], [], Xs, Xs).
appendh([_, _ | Ws], [X | Xs], Ys, [X | Zs]) :-
appendh(Ws, Xs, Ys, Zs).
m([H|A], X) :-
A = [_|_], % New
m(H, X, A).
m(1, X, A) :-
append(X, [2], A).
m(3, X, A) :-
appendh(X, B, B, X),
m(A, B).
m(4, X, A) :-
reverse(X, B),
m(A, B).
m(5, X, A) :-
X = [_| _],
m(A, [_|X]).
m(H1, [H2 | B], A) :-
\+ \+ ( H2 = 1 ; H2 = 2 ), % New
m(A, B),
( H1 = 6, H2 = 1
; H1 = 7, H2 = 2
).
answer3(X) :-
between(1, 13, N),
length(X, N),
reverse(X, A),
m(X, A).
run :-
time(findall(X, (answer3(X), write(X), nl), _)).
I propose here another solution which is basically exhaustive exploration. Given the questions, if the length of the first argument of m/2 is known, the length of the second is known as well. If the length of the second argument is always known, this can be used to cut down the search earlier by propagating some constraints down to the recursive calls. However, this is not compatible with the optimization proposed by false.
appendh([], [], Xs, Xs).
appendh([_, _ | Ws], [X | Xs], Ys, [X | Zs]) :-
appendh(Ws, Xs, Ys, Zs).
m([1 | A], X) :-
append(X, [2], A).
m([3 | A], X) :-
appendh(X, B, B, X),
m(A, B).
m([4 | A], X) :-
reverse(X, B),
m(A, B).
m([5 | A], X) :-
B = [_, _ | _],
B = [_ | X],
m(A, B).
m([H1 | A], [H2 | B]) :-
m(A, B),
( H1 = 6, H2 = 1
; H1 = 7, H2 = 2
).
answer3(X) :-
between(1, 13, N),
length(X, N),
reverse(X, A),
m(X, A).
Here is the time taken respectively by: this code, this code when swapping recursive calls with the constraints of each case (similar to solution of Sergey Dymchenko), and the solution of false which factor the recursive calls. The test is run on SWI and search for all the solution whose length is less or equal to 13.
% 36,380,535 inferences, 12.281 CPU in 12.315 seconds (100% CPU, 2962336 Lips)
% 2,359,464,826 inferences, 984.253 CPU in 991.474 seconds (99% CPU, 2397214 Lips)
% 155,403,076 inferences, 47.799 CPU in 48.231 seconds (99% CPU, 3251186 Lips)
All measures are performed with the call:
?- time(findall(X, (answer3(X), writeln(X)), _)).
(I assume that this is about a list of digits, as you suggest. Contrary to the link you gave, which talks about numbers. There might be differences with leading zeros. I did not take the time to think that through)
First of all, Prolog is an excellent language to search brute force. For, even in that case, Prolog is able to mitigate combinatorial explosion. Thanks to the logic variable.
Your problem statements are essentially existential statements: Does there exist an X such that such and such is true. That's where Prolog is best at. The point is the way how you are asking the question. Instead of asking with concrete values like [1] and so on, simply ask for:
?- length(Xs, N), m(Xs,Xs).
Xs = [3,2,3], N = 3
; ... .
And similarly for the other queries. Note that there is no need to settle for concrete values! This makes the search certainly more expensive!
?- length(Xs, N), maplist(between(0,9),Xs), m(Xs,Xs).
Xs = [3,2,3], N = 3
; ... .
In this manner it is quite efficiently possible to find concrete solutions, should they exist. Alas, we cannot decide that a solution does not exist.
Just to illustrate the point, here is the answer for the "most complex" puzzle:
?- length(Xs0,N),
append(Xs0,[2|Xs0],Xs1),reverse(Xs1,Xs2),append(Xs2,Xs2,Xs3), m(Xs0,Xs3).
Xs0 = [4, 5, 3, 3, 2, 4, 5, 3, 3], N = 9, ...
; ... .
It comes up in no time. However, the query:
?- length(Xs0,N), maplist(between(0,9),Xs0),
append(Xs0,[2|Xs0],Xs1),reverse(Xs1,Xs2),append(Xs2,Xs2,Xs3), m(Xs0,Xs3).
is still running!
The m/2 I used:
m([2|Xs], Xs).
m([3|Xs0], Xs) :-
m(Xs0,Xs1),
append(Xs1,[2|Xs1], Xs).
m([4|Xs0], Xs) :-
m(Xs0, Xs1),
reverse(Xs1,Xs).
m([5|Xs0],Xs) :-
m(Xs0,Xs1),
append(Xs1,Xs1,Xs).
The reason why this is more effective is simply that a naive enumeration of all n digits has 10n different candidates, whereas Prolog will only search for 3n given by the 3 recursive rules.
Here is yet another optimization: All 3 rules have the very same recursive goal. So why do this thrice, when once is more than enough:
m([2|Xs], Xs).
m([X|Xs0], Xs) :-
m(Xs0,Xs1),
( X = 3,
append(Xs1,[2|Xs1], Xs)
; X = 4,
reverse(Xs1,Xs)
; X = 5,
append(Xs1,Xs1,Xs)
).
For the last query, this reduces from 410,014 inferences, 0.094s CPU down to 57,611 inferences, 0.015s CPU.
Edit: In a further optimization the two append/3 goals can be merged:
m([2|Xs], Xs).
m([X|Xs0], Xs) :-
m(Xs0,Xs1),
( X = 4,
reverse(Xs1,Xs)
; append(Xs1, Xs2, Xs),
( X = 3, Xs2 = [2|Xs1]
; X = 5, Xs2 = Xs1
)
).
... which further reduces execution to 39,096 inferences and runtime by 1ms.
What else can be done? The length is bounded by the length of the "input". If n is the length of the input, then 2(n-1)-1 is the longest output. Is this helping somehow? Probably not.
Tabling (memoization) can help with harder variants of the problem.
Here is my implementation for the third question of second example in B-Prolog (returns all solutions of length 13 or less):
:- table m/2.
m(A, X) :-
append([1 | X], [2], A).
m([3 | X], Z) :-
m(X, Y),
append(Y, Y, Z).
m([4 | X], Z) :-
m(X, Y),
reverse(Y, Z).
m([5 | X], Z) :-
m(X, Y),
Y = [_ | Z].
m([6 | X], Z) :-
m(X, Y),
Z = [1 | Y].
m([7 | X], Z) :-
m(X, Y),
Z = [2 | Y].
question3(X) :-
between(1, 13, N),
length(X, N),
reverse(X, Z), m(X, Z).
Run:
B-Prolog Version 8.1, All rights reserved, (C) Afany Software 1994-2014.
| ?- cl(smullyan2).
cl(smullyan2).
Compiling::smullyan2.pl
compiled in 2 milliseconds
loading...
yes
| ?- time(findall(_, (question3(X), writeln(X)), _)).
time(findall(_, (question3(X), writeln(X)), _)).
[7,3,4,1,7,3,4,1,2]
[7,4,3,1,7,4,3,1,2]
[3,7,4,5,1,2,3,7,4,5,1,2]
[7,4,5,3,1,_678,7,4,5,3,1,2]
[7,4,5,3,6,1,7,4,5,3,6,2]
[7,5,3,6,4,1,7,5,3,6,4,2]
[4,4,7,3,4,1,4,4,7,3,4,1,2]
[4,4,7,4,3,1,4,4,7,4,3,1,2]
[5,6,7,3,4,1,5,6,7,3,4,1,2]
[5,6,7,4,3,1,5,6,7,4,3,1,2]
[5,7,7,3,4,1,5,7,7,3,4,1,2]
[5,7,7,4,3,1,5,7,7,4,3,1,2]
[7,3,4,4,4,1,7,3,4,4,4,1,2]
[7,3,4,5,1,_698,7,3,4,5,1,_698,2]
[7,3,4,5,6,1,7,3,4,5,6,1,2]
[7,3,4,5,7,1,7,3,4,5,7,1,2]
[7,3,5,6,4,1,7,3,5,6,4,1,2]
[7,3,5,7,4,1,7,3,5,7,4,1,2]
[7,3,6,5,4,1,7,3,6,5,4,1,2]
[7,4,3,4,4,1,7,4,3,4,4,1,2]
[7,4,3,5,1,_698,7,4,3,5,1,_698,2]
[7,4,3,5,6,1,7,4,3,5,6,1,2]
[7,4,3,5,7,1,7,4,3,5,7,1,2]
[7,4,4,3,4,1,7,4,4,3,4,1,2]
[7,4,4,4,3,1,7,4,4,4,3,1,2]
[7,4,5,6,3,1,7,4,5,6,3,1,2]
[7,4,5,7,3,1,7,4,5,7,3,1,2]
[7,5,6,3,4,1,7,5,6,3,4,1,2]
[7,5,6,4,3,1,7,5,6,4,3,1,2]
[7,5,7,3,4,1,7,5,7,3,4,1,2]
[7,5,7,4,3,1,7,5,7,4,3,1,2]
[7,6,5,3,4,1,7,6,5,3,4,1,2]
[7,6,5,4,3,1,7,6,5,4,3,1,2]
CPU time 25.392 seconds.
yes
So it's less than a minute for this particular problem.
I don't think constraint programming will be of any help with this type of problem, especially with "find 20 first solutions" variant.
Update: running times of the same program on my computer on different systems:
B-Prolog 8.1 with tabling: 26 sec
B-Prolog 8.1 without tabling: 128 sec
ECLiPSe 6.1 #187: 122 sec
SWI-Prolog 6.2.6: 330 sec

How to duplicate the behavior of predefined length/2 in SWI-Prolog?

I'm trying to duplicate the behavior of the standard length/2 predicate. In particular, I want my predicate to work for bounded and unbounded arguments, like in the example below:
% Case 1
?- length(X, Y).
X = [],
Y = 0 ;
X = [_G4326],
Y = 1 ;
X = [_G4326, _G4329],
Y = 2 ;
X = [_G4326, _G4329, _G4332],
Y = 3 .
% Case 2
?- length([a,b,c], X).
X = 3.
% Case 3
?- length(X, 4).
X = [_G4314, _G4317, _G4320, _G4323].
% Case 4
?- length([a,b,c,d,e], 5).
true.
The plain&simple implementation:
my_length([], 0).
my_length([_|T], N) :- my_length(T, X), N is 1+X.
has some problems. In Case 3, after producing the correct answer, it goes into an infinite loop. Could this predicate be transformed into a deterministic one? Or non-deterministic that halts with false?
YES! But using red cut. See: https://stackoverflow.com/a/15123016/1545971
After some time, I've managed to code a set of predicates, that mimic the behavior of the build-in length/2. my_len_tail is deterministic and works correct in all Cases 1-4. Could it be done simpler?
my_len_tail(List, Len) :- var(Len)->my_len_tailv(List, 0, Len);
my_len_tailnv(List, 0, Len).
my_len_tailv([], Acc, Acc).
my_len_tailv([_|T], Acc, Len) :-
M is Acc+1,
my_len_tailv(T, M, Len).
my_len_tailnv([], Acc, Acc) :- !. % green!
my_len_tailnv([_|T], Acc, Len) :-
Acc<Len,
M is Acc+1,
my_len_tailnv(T, M, Len).
As #DanielLyons suggested in the comments, one can use clpfd to defer less than check. But it still leaves one problem: in Case 3 (my_len_clp(X, 3)) the predicate is nondeterministic. How it could be fixed?
:-use_module(library(clpfd)).
my_len_clp(List, Len) :- my_len_clp(List, 0, Len).
my_len_clp([], Acc, Acc).
my_len_clp([_|T], Acc, Len) :-
Acc#<Len,
M is Acc+1,
my_len_clp(T, M, Len).
It can be fixed using zcompare/3 from the CLP(FD) library. See: https://stackoverflow.com/a/15123146/1545971
In SWI-Prolog, the nondeterminism issue can be solved with CLP(FD)'s zcompare/3, which reifies the inequality to a term that can be used for indexing:
:- use_module(library(clpfd)).
my_length(Ls, L) :-
zcompare(C, 0, L),
my_length(Ls, C, 0, L).
my_length([], =, L, L).
my_length([_|Ls], <, L0, L) :-
L1 #= L0 + 1,
zcompare(C, L1, L),
my_length(Ls, C, L1, L).
Your example is now deterministic (since recent versions of SWI-Prolog perform just-in-time indexing):
?- my_length(Ls, 3).
Ls = [_G356, _G420, _G484].
All serious Prolog implementations ship with CLP(FD), and it makes perfect sense to use it here. Ask your vendor to also implement zcompare/3 or a better alternative if it is not already available.
For a set of test cases, please refer to this table and to the current definition in the prologue. There are many more odd cases to consider.
Defining length/2 with var/nonvar, is/2 and the like is not entirely trivial, because (is)/2 and arithmetic comparison is so limited. That is, they produce very frequently instantiation_errors instead of succeeding accordingly. Just to illustrate that point: It is trivial to define length_sx/2 using successor-arithmetics.
length_sx([], 0).
length_sx([_E|Es], s(X)) :-
length_sx(Es, X).
This definition is pretty perfect. It even fails for length_sx(L, L). Alas, successor arithmetics is not supported efficiently. That is, an integer i requires O(i) space and not O(log i) as one would expect.
The definition I would have preferred is:
length_fd([],0).
length_fd([_E|Es], L0) :-
L0 #> 0,
L1 #= L0-1,
length_fd(Es, L1).
Which is the most direct translation. It is quite efficient with a known length, but otherwise the overhead of constraints behind shows. Also, there is this asymmetry:
?- length_fd(L,0+0).
false.
?- length_fd(L,0+1).
L = [_A]
; false.
However, your definition using library(clpfd) is particularly elegant and efficient even for more elaborate cases.. It isn't as fast as the built-in length...
?- time(( length_fd(L,N),N=1000 )).
% 29,171,112 inferences, 4.110 CPU in 4.118 seconds (100% CPU, 7097691 Lips)
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...], N = 1000
; ... .
?- time(( my_len_clp(L,N),N=10000 )).
% 1,289,977 inferences, 0.288 CPU in 0.288 seconds (100% CPU, 4484310 Lips)
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...], N = 10000
; ... .
?- time(( length(L,N),N=10000 )).
% 30,003 inferences, 0.006 CPU in 0.006 seconds (100% CPU, 4685643 Lips)
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...], N = 10000
; ... .
... but then it is able to handle constraints correctly:
?- N in 1..2, my_len_clp(L,N).
N = 1, L = [_A]
; N = 2, L = [_A, _B]
; false.
?- N in 1..2, length(L,N).
N = 1, L = [_A]
; N = 2, L = [_A, _B]
; loops.
I am not especially confident in this answer but my thinking is no, you have to do some extra work to make Prolog do the right thing for length/2, which is a real shame because it's such a great "tutorial" predicate in the simplest presentation.
I submit as proof, the source code to this function in SWI-Prolog and the source in GNU Prolog. Neither of these is a terse, cute trick, and it looks to me like they both work by testing the arguments and then deferring processing to different internal functions depending on which argument is instantiated.
I would love to be wrong about this though. I have often wondered why it is, for instance, so easy to write member/2 which does the right thing but so hard to write length/2 which does. Prolog isn't great at arithmetic, but is it really that bad? Here's hoping someone else comes along with a better answer.
(I've tried to edit #false's response, but it was rejected)
my_len_tail/2 is faster (in terms of both the number of inferences and actual time) than buldin length/2 when generating a list, but has problem with N in 1..2 constraint.
?- time(( my_len_tail(L,N),N=10000000 )).
% 20,000,002 inferences, 2.839 CPU in 3.093 seconds (92% CPU, 7044193 Lips)
L = [_G67, _G70, _G73, _G76, _G79, _G82, _G85, _G88, _G91|...],
N = 10000000 .
?- time(( length(L,N),N=10000000 )).
% 30,000,004 inferences, 3.557 CPU in 3.809 seconds (93% CPU, 8434495 Lips)
L = [_G67, _G70, _G73, _G76, _G79, _G82, _G85, _G88, _G91|...],
N = 10000000 .
This works for all your test cases (but it has red cut):
my_length([], 0).
my_length([_|T], N) :-
( integer(N) ->
!,
N > 0,
my_length(T, X), N is 1 + X, !
;
my_length(T, X), N is 1 + X
).
implementation
goal_expansion((_lhs_ =:= _rhs_),(when(ground(_rhs_),(_lhs_ is _rhs_)))) .
:- op(2'1,'yfx','list') .
_list_ list [size:_size_] :-
_list_ list [size:_size_,shrink:_shrink_] ,
_list_ list [size:_size_,shrink:_shrink_,size:_SIZE_] .
_list_ list [size:0,shrink:false] .
_list_ list [size:_size_,shrink:true] :-
when(ground(_size_),(_size_ > 0)) .
[] list [size:0,shrink:false,size:0] .
[_car_|_cdr_] list [size:_size_,shrink:true,size:_SIZE_] :-
(_SIZE_ =:= _size_ - 1) ,
(_size_ =:= _SIZE_ + 1) ,
_cdr_ list [size:_SIZE_] .
testing
/*
?- L list Z .
L = [],
Z = [size:0] ? ;
L = [_A],
Z = [size:1] ? ;
L = [_A,_B],
Z = [size:2] ? ;
L = [_A,_B,_C],
Z = [size:3] ?
yes
?- L list [size:0] .
L = [] ? ;
no
?- L list [size:1] .
L = [_A] ? ;
no
?- L list [size:2] .
L = [_A,_B] ? ;
no
?- [] list [size:S] .
S = 0 ? ;
no
?- [a] list [size:S] .
S = 1 ? ;
no
?- [a,b] list [size:S] .
S = 2 ? ;
no
?- [a,b,c] list [size:S] .
S = 3 ? ;
no
?-
*/

Resources