Complexity of counting matching elements in two sequences using `library(aggregate)` - prolog

We want to count the correspondences between two (possibly long) strings which happen to represent DNA sequences. The sequences are lists-of-chars where the char is taken from a,c,t,g,'_', with the '_' a "don't know" placeholder which never corresponds to anything, even itself. In this case, we employ library(aggregate) (thanks to CapelliC for the idea):
match(Seq1,Seq2,Count) :-
   aggregate_all(count,
      ( nth1(Pos,Seq1,X),
        nth1(Pos,Seq2,X),
        memberchk(X,[a,c,g,t]) ),
      Count).
This approach can be compared to a "straightforward" approach where one would set up a (tail-recursive) recursion that just walks down both sequences in tandem and compares elements pairwise, counting as it goes.
As the sequences can be very large, algorithmic complexity becomes of some interest.
One would expect, with n = length(sequence) and both sequences the same length:
Straightforward approach: complexity is O(n)
Aggregation approach: complexity is O(n²)
What is the (time and maybe space) complexity of the above algorithm and why?
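For reference, here is a minimal sketch of the "straightforward" tail-recursive approach mentioned above (my own illustration, not code from the question; match_walk/3 is an assumed name, and it assumes both sequences have the same length):
match_walk(Seq1,Seq2,Count) :-
   match_walk_(Seq1,Seq2,0,Count).
% Walk both lists in tandem; bump the accumulator whenever the elements
% are equal and drawn from [a,c,g,t], so '_' never counts.
match_walk_([],[],Count,Count).
match_walk_([X|Xs],[Y|Ys],Acc0,Count) :-
   (  X == Y, memberchk(X,[a,c,g,t])
   -> Acc1 is Acc0 + 1
   ;  Acc1 = Acc0
   ),
   match_walk_(Xs,Ys,Acc1,Count).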
Test code
To complement the above, an SWI-Prolog based plunit test code block:
:- begin_tests(atcg).
wrap_match(String1,String2,Count) :-
   atom_chars(String1,Seq1),
   atom_chars(String2,Seq2),
   match(Seq1,Seq2,Count).
test("string 1 empty",nondet) :-
wrap_match("atcg","",Count),
assertion(Count == 0).
test("string 2 empty") :-
wrap_match("","atcg",Count),
assertion(Count == 0).
test("both strings empty") :-
wrap_match("","",Count),
assertion(Count == 0).
test("both strings match, 1 char only") :-
wrap_match("a","a",Count),
assertion(Count == 1).
test("both strings match") :-
wrap_match("atcgatcgatcg","atcgatcgatcg",Count),
assertion(Count == 12).
test("both strings match with underscores") :-
wrap_match("_TC_ATCG_TCG","_TC_ATCG_TCG",Count),
assertion(MatchCount == 9).
test("various mismatches 1") :-
wrap_match("atcgatcgatcg","atcgatcgatcg",Count),
assertion(Count == 8).
test("various mismatches with underscores") :-
wrap_match("at_ga_cg__cg","atcgatcgatcg",Count),
assertion(Count == 8).
:- end_tests(atcg).
And so:
?- run_tests.
% PL-Unit: atcg ........ done
% All 8 tests passed
true.

Empirical info
After some manual data collection (something that cries out for automation; a small harness sketch follows the data table below) using the code below, which outputs the time elapsed and the number of inferences made to the console:
gimme_random_sequence(Length,Seq) :-
length(Seq,Length),
maplist(
[E]>>(random_between(0,3,Ix),nth0(Ix,[a,t,c,g],E)),
Seq).
how_fast(Length) :-
gimme_random_sequence(Length,Seq1),
gimme_random_sequence(Length,Seq2),
time(match(Seq1,Seq2,_)).
... and a bit of graph fumbling in LibreOffice Calc (my ggplot skills are rusty), we have empirical evidence that this algorithm's cost is
O((length(sequence))²).
Length,Inferences,Seconds,Milliseconds,Megainferences
1000,171179,0.039,39,0.171179
2000,675661,0.097,97,0.675661
3000,1513436,0.186,186,1.513436
4000,2684639,0.327,327,2.684639
5000,4189172,0.502,502,4.189172
6000,6027056,0.722,722,6.027056
7000,8198103,1.002,1002,8.198103
8000,10702603,1.304,1304,10.702603
9000,13540531,1.677,1677,13.540531
10000,16711607,2.062,2062,16.711607
11000,20216119,2.449,2449,20.216119
20000,66756619,8.091,8091,66.756619
30000,150134731,17.907,17907,150.134731
40000,266846773,32.012,32012,266.846773
50000,416891749,52.942,52942,416.891749
60000,600269907,74.103,74103,600.269907
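For completeness, here is a small harness that automates the manual data collection above (my own sketch, not the question's code; it simply reruns how_fast/1 for a list of lengths and lets time/1 print the inferences and CPU time for each run):
run_benchmarks :-
   forall(member(Length,[1000,2000,4000,8000,16000,32000]),
          ( format("Length = ~d~n",[Length]),
            how_fast(Length) )).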

Never ever use functional programming idioms in Prolog that avoid backtracking, like maplist/4. The code below, pair_member/4 and match3/3, should be a tick faster.
match2(Seq1, Seq2, Count) :-
( maplist([X,Y,X-Y]>>true, Seq1, Seq2, Seq3)
-> aggregate_all(count, (member(X-X, Seq3), X\='_'), Count)
; Count = 0 ).
pair_member(X, Y, [X|_], [Y|_]).
pair_member(X, Y, [_|L], [_|R]) :-
pair_member(X, Y, L, R).
match3(Seq1, Seq2, Count) :-
aggregate_all(count,
(pair_member(X, X, Seq1, Seq2), X \= '_'), Count).
gimme_random_sequence(Length, Seq) :-
length(Seq, Length),
maplist([E]>>(random_between(0,3,Ix), nth0(Ix, [a,t,c,g], E)), Seq).
test(N) :-
gimme_random_sequence(N, S1),
gimme_random_sequence(N, S2),
time(match2(S1, S2, Count)),
time(match3(S1, S2, Count)).
Whoa! It's 10x faster! Thanks to the genius of SWI-Prolog and how it
compiles the tail recursion in pair_member/4:
/* SWI-Prolog 8.3.21, MacBook Air 2019 */
?- set_prolog_flag(double_quotes, chars).
true.
?- X = "abc".
X = [a, b, c].
?- match2("_TC_ATCG_TCG","_TC_ATCG_TCG",Count).
Count = 9.
?- match3("_TC_ATCG_TCG","_TC_ATCG_TCG",Count).
Count = 9.
?- test(100000).
% 1,575,520 inferences, 0.186 CPU in 0.190 seconds (98% CPU, 8465031 Lips)
% 175,519 inferences, 0.018 CPU in 0.019 seconds (98% CPU, 9577595 Lips)
true.
Edit 29.04.2021:
Oh the irony, bifurcation backtracking is nevertheless challenging.
After fixing a misuse of library(apply_macros), I get:
?- test(100000).
% 374,146 inferences, 0.019 CPU in 0.019 seconds (99% CPU, 19379778 Lips)
% 174,145 inferences, 0.014 CPU in 0.014 seconds (99% CPU, 12400840 Lips)
true.
Does native member/2 contribute to the good maplist solution performance?
But I should do a better measurement, with longer durations.
Open Source:
Sequence Match Problem
https://gist.github.com/jburse/9fd22e8c3e8de6148fbd341817538ef6#file-sequence-pl

I think that it is interesting to observe that the O(n²) complexity is not due to the aggregation approach itself, but to the fact that the subgoal nth1(Pos,Seq1,X), nth1(Pos,Seq2,X) behaves as a "nested loop" (in the size n of the sequences).
Thus, it should be possible to create another algorithm that, even using aggregation, can have complexity O(n), as long as the "nested loop" is eliminated.
Algorithms to compare
% Original algorithm: Complexity O(n²)
match1(Seq1, Seq2, Count) :-
aggregate_all(count,
( nth1(Pos, Seq1, X),
nth1(Pos, Seq2, X),
memberchk(X, [a,c,g,t]) ),
Count).
% Proposed algorithm: Complexity O(n)
match2(Seq1, Seq2, Count) :-
( maplist([X,Y,X-Y]>>true, Seq1, Seq2, Seq3)
-> aggregate_all(count, (member(X-X, Seq3), X\='_'), Count)
; Count = 0 ).
gimme_random_sequence(Length, Seq) :-
length(Seq, Length),
maplist([E]>>(random_between(0,3,Ix), nth0(Ix, [a,t,c,g], E)), Seq).
test(N) :-
gimme_random_sequence(N, S1),
gimme_random_sequence(N, S2),
time(match1(S1, S2, Count)),
time(match2(S1, S2, Count)).
Simple empirical results
?- test(10000).
% 16,714,057 inferences, 1.156 CPU in 1.156 seconds (100% CPU, 14455401 Lips)
% 39,858 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
true.
?- test(20000).
% 66,761,535 inferences, 4.594 CPU in 4.593 seconds (100% CPU, 14533123 Lips)
% 79,826 inferences, 0.016 CPU in 0.016 seconds (100% CPU, 5108864 Lips)
true.
?- test(40000).
% 266,856,213 inferences, 19.734 CPU in 19.841 seconds (99% CPU, 13522405 Lips)
% 159,398 inferences, 0.016 CPU in 0.015 seconds (104% CPU, 10201472 Lips)
true.
?- test(80000).
% 1,067,046,835 inferences, 77.203 CPU in 77.493 seconds (100% CPU, 13821291 Lips)
% 320,226 inferences, 0.047 CPU in 0.047 seconds (100% CPU, 6831488 Lips)
true.
Edit 30/04/2021:
Does nth1(I,S,X), nth1(I,S,X) really work as nested loop?
To see that the answer to this question is yes, consider the following simple implementation of nth/3, that counts the number of rounds needed to find each solution, using a global flag:
nth(Index, List, Item) :-
( var(Index)
-> nth_nondet(1, Index, List, Item)
; integer(Index)
-> nth_det(Index, List, Item)
).
nth_det(1, [Item|_], Item) :- !.
nth_det(Index, [_|Rest], Item) :-
flag(rounds, Rounds, Rounds+1),
Index1 is Index - 1,
nth_det(Index1, Rest, Item).
nth_nondet(Index, Index, [Item|_], Item).
nth_nondet(Acc, Index, [_|Rest], Item) :-
flag(rounds, Rounds, Rounds+1),
Acc1 is Acc + 1,
nth_nondet(Acc1, Index, Rest, Item).
To get the number of rounds, you can ask:
?- flag(rounds,_,0), nth(5,[a,b,c,d,e],X), flag(rounds,Rounds,Rounds).
X = e,
Rounds = 4.
Now, using this predicate, we can create a predicate to count the number of rounds of the goal nth(I,L,X), nth(I,L,X), for lists of different lengths:
count_rounds :-
forall(between(1, 10, N),
( Length is 10*N,
count_rounds(Length, Rounds),
writeln(rounds(Length) = Rounds)
)).
count_rounds(Length, _) :-
numlist(1, Length, List),
flag(rounds, _, 0),
nth(Index, List, Item),
nth(Index, List, Item),
fail.
count_rounds(_, Rounds) :-
flag(rounds, Rounds, Rounds).
Empirical results:
?- count_rounds.
rounds(10) = 55
rounds(20) = 210
rounds(30) = 465
rounds(40) = 820
rounds(50) = 1275
rounds(60) = 1830
rounds(70) = 2485
rounds(80) = 3240
rounds(90) = 4095
rounds(100) = 5050
As we can see, the goal nth(I,L,X), nth(I,L,X) computes half of a square matrix of order n (including its diagonal). Thus, the number of rounds for a list of length n is rounds(n) = (n² + n)/2. Hence, the time complexity of this goal is O(n²).
Remark: The implementation of the library predicate nth1/3 is a little more efficient than that of the predicate nth/3 considered for this experiment. Nevertheless, the time complexity of the goal nth1(I,S,X), nth1(I,S,X) is still O(n²).
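As a quick cross-check of the formula above (my own addition, not part of the answer), the expected number of rounds can be computed directly and compared with the empirical values:
expected_rounds(Length, Rounds) :-
   Rounds is (Length*Length + Length) // 2.
?- expected_rounds(100, Rounds).
Rounds = 5050.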

This is a follow-up to @MostowskiCollapse's answer, in which I apply to pair_member/4 the same optimization that Gertjan van Noord provided for member/2, renaming the result member/4.
member(X, Y, [XH|XT], [YH|YT]) :-
member_(XT, YT, X, Y, XH, YH).
member_(_, _, X,Y, X,Y).
member_([XH|XT],[YH|YT], X,Y, _,_) :-
member_(XT,YT, X,Y, XH,YH).
match4(Seq1, Seq2, Count) :-
aggregate_all(count,
(member(X, X, Seq1, Seq2), X \= '_'), Count).
test(N) :-
gimme_random_sequence(N, S1),
gimme_random_sequence(N, S2),
%time(match2(S1, S2, Count)),
time(match3(S1, S2, Count)),
time(match4(S1, S2, Count)).
...
with lists of length 1,000,000 I get
% 1,751,758 inferences, 0.835 CPU in 0.835 seconds (100% CPU, 2098841 Lips)
% 1,751,757 inferences, 0.637 CPU in 0.637 seconds (100% CPU, 2751198 Lips)
that is, a gain of about 25%...

Related

Prolog program that swaps the two halves of a list

I am new to this language and am having trouble coming up with a solution to this problem. The program must implement the following cases.
Both variables are instantiated:
pivot( [1,2,3,4,5,6,7], [5,6,7,4,1,2,3] ).
yields a true/yes result.
Only Before is instantiated:
pivot( [1,2,3,4,5,6], R ).
unifies R = [4,5,6,1,2,3] as its one result.
Only After is instantiated:
pivot(L, [1,2]).
unifies L = [2,1] as its one result.
Neither variable is instantiated:
pivot(L, R).
is undefined (since results are generated arbitrarily).
If by pivot, you mean to split the list in 2 and swap the halves, then something like this would work.
First, consider the normal case: If you have an instantiated list, pivoting it is trivial. You just need to
figure out half the length of the list
break it up into
a prefix, consisting of that many items, and
a suffix, consisting of whatever is left over
concatenate those two lists in reverse order
Once you have that, everything else is just a matter of deciding which variable is bound and using that as the source list.
It is a common Prolog idiom to have a single "public" predicate that invokes a "private" worker predicate that does the actual work.
Given that the problem statement requires that at least one of the two variables in your pivot/2 must be instantiated, we can define our public predicate along these lines:
pivot( Ls , Rs ) :- nonvar(Ls), !, pivot0(Ls,Rs) .
pivot( Ls , Rs ) :- nonvar(Rs), !, pivot0(Rs,Ls) .
If Ls is bound, we invoke the worker, pivot0/2 with the arguments as-is. But if Ls is unbound, and Rs is bound, we invoke it with the arguments reversed. The cuts (!) are there to prevent the predicate from succeeding twice if invoked with both arguments bound (pivot([a,b,c],[a,b,c]).).
Our private helper, pivot0/2 is simple, because it knows that the 1st argument will always be bound:
pivot0( Ls , Rs ) :- % to divide a list in half and exchange the halves...
length(Ls,N0) , % get the length of the source list
N is N0 // 2 , % divide it by 2 using integer division
length(Pfx,N) , % construct an unbound list of the desired length
append(Pfx,Sfx,Ls) , % break the source list up into its two halves
append(Sfx,Pfx,Rs) % put the two halves back together in the desired order
. % Easy!
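A quick check against the examples from the question (these results follow from the definitions above):
?- pivot([1,2,3,4,5,6], R).
R = [4,5,6,1,2,3].
?- pivot(L, [1,2]).
L = [2,1].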
In swi-prolog:
:- use_module(library(dcg/basics)).
pivot_using_dcg3(Lst, LstPivot) :-
list_first(Lst, LstPivot, L1, L2, IsList),
phrase(piv3_up(L1), L1, L2),
% Improve determinism
(IsList = true -> ! ; true).
piv3_up(L), string(Ri), string(M), string(Le) --> piv3(L, Le, M, Ri).
piv3([], [], [], Ri) --> [], remainder(Ri).
piv3([_], [], [H], Ri) --> [H], remainder(Ri).
piv3([_, _|Lst], [H|T], M, Ri) --> [H], piv3(Lst, T, M, Ri).
% From 2 potential lists, rearrange them in order of usefulness
list_first(V1, V2, L1, L2, IsList) :-
( is_list(V1) ->
L1 = V1, L2 = V2,
IsList = true
; L1 = V2, L2 = V1,
(is_list(L1) -> IsList = true ; IsList = false)
).
It is general and deterministic, with good performance:
?- time(pivot_using_dcg3(L, P)).
% 18 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 402441 Lips)
L = P, P = [] ;
% 8 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 238251 Lips)
L = P, P = [_] ;
% 10 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 275073 Lips)
L = [_A,_B],
P = [_B,_A] ;
% 10 inferences, 0.000 CPU in 0.000 seconds (94% CPU, 313391 Lips)
L = [_A,_B,_C],
P = [_C,_B,_A] ;
% 12 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 321940 Lips)
L = [_A,_B,_C,_D],
P = [_C,_D,_A,_B] ;
% 12 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 345752 Lips)
L = [_A,_B,_C,_D,_E],
P = [_D,_E,_C,_A,_B] ;
% 14 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 371589 Lips)
L = [_A,_B,_C,_D,_E,_F],
P = [_D,_E,_F,_A,_B,_C] ;
?- numlist(1, 5000000, P), time(pivot_using_dcg3(L, P)).
% 7,500,018 inferences, 1.109 CPU in 1.098 seconds (101% CPU, 6759831 Lips)
The performance could be improved further, using difference lists for the final left-middle-right append, and cuts (sacrificing generality).

Finding intersection between two lists without duplicates in prolog

I am using Prolog and I am trying to find the intersection or the common elements between two lists and the result should not contain duplicates. In addition, the case of lists with different lengths should be handled. The result of the predicate should be as follows:
?- no_duplicates_intersection([a,v,a,c],[a,a,a,a,a],L).
L = [a].
Actually, I found a question or two tackling the same issue, but the answers were way too long. I was wondering if there was a more straightforward and easier method using the following predicate, which returns the intersection between two lists with duplicates:
intersection_with_dulpicates([], [], []).
intersection_with_dulpicates([],M,[]).
intersection_with_dulpicates([X|Y],M,[X|Z]):-
member(X,M),
intersection_with_dulpicates(Y,M,Z).
intersection_with_dulpicates([X|Y],M,Z):-
\+member(X,M),
intersection_with_dulpicates(Y,M,Z).
Taking advantage of the built-in sort (which also removes duplicates):
intersection_without_duplicates(Lst1, Lst2, Intersection) :-
% Sort and remove duplicates from both
% The built-in sort is quick
sort(Lst1, Lst1Sorted),
sort(Lst2, Lst2Sorted),
intersect_sorted(Lst1Sorted, Lst2Sorted, Intersection).
intersect_sorted([], _Lst2Sorted, []).
intersect_sorted([H|T], LstSorted, Intersection) :-
( member_listsorted(H, LstSorted)
-> Intersection = [H|Intersection0]
; Intersection0 = Intersection
),
intersect_sorted(T, LstSorted, Intersection0).
member_listsorted(H, LstSorted) :-
member_listsorted_(LstSorted, H).
member_listsorted_([H|T], Elem) :-
( H #< Elem
-> member_listsorted_(T, Elem)
; H = Elem
).
Sample output in swi-prolog:
?- time(intersection_without_duplicates([a, b, c, d, b, c, d], [b, c, b, c, d],
I)).
% 31 inferences, 0.000 CPU in 0.000 seconds (89% CPU, 586277 Lips)
I = [b,c,d].
?- numlist(1, 10000, Lst1), numlist(5000, 12345, Lst2), time((intersection_without_duplicates(Lst1, Lst2, Intersection))).
% 25,060,003 inferences, 1.313 CPU in 1.297 seconds (101% CPU, 19090034 Lips)
Performance comparison with @TessellatingHeckler's suggestion:
?- numlist(1, 10000, Lst1), numlist(5000, 12345, Lst2), time((intersection(Lst1, Lst2, Both), sort(Both, Answer))).
% 35,001 inferences, 2.193 CPU in 2.167 seconds (101% CPU, 15957 Lips)
Following the design of intersection_with_dulpicates you can try
no_duplicates_intersection([], _L2, []).
no_duplicates_intersection([X|Y],L, Intersection):-
no_duplicates_intersection(Y,L,Cur_intersection),
( (member(X, Cur_intersection); \+ member(X,L))
-> Intersection = Cur_intersection
; Intersection = [X | Cur_intersection]).
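A quick check against the query from the question (this is the expected result given the clauses above):
?- no_duplicates_intersection([a,v,a,c],[a,a,a,a,a],L).
L = [a].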

Arithmetics in Prolog, represent a number using powers of 2

I have two numbers, let's name them N and K, and I want to write N using K powers of 2.
For example if N = 9 and K = 4, then N could be N = 1 + 2 + 2 + 4 (2^0 + 2^1 + 2^1 + 2^2).
My program should output something like N = [1,2,2,4].
I am used to C++. I can't find a way to solve this problem in Prolog. Any help will be appreciated!
I thought this would be a few-liner using CLP(FD), but no dice. Can it be done more simply?
So here is the complete solution.
Don't think I came up with this in one attempt, there are a few iterations and dead ends in there.
:- use_module(library(debug)).
% ---
% powersum(+N,+Target,?Solution)
% ---
% Entry point. Relate a list "Solution" of "N" integers to the integer
% "Target", which is the sum of 2^Solution[i].
% This works only in the "functional" direction
% "Compute Solution as powersum(N,Target)"
% or the "verification" direction
% "is Solution a solution of powersum(N,Target)"?
%
% An extension of some interest would be to NOT have a fixed "N".
% Let powersum/2 find appropriate N.
%
% The search is subject to exponential slowdown as the list length
% increases, so one gets bogged down quickly.
% ---
powersum(N,Target,Solution) :-
((integer(N),N>0,integer(Target),Target>=1) -> true ; throw("Bad args!")),
length(RS,N), % create a list RS of N fresh variables
MaxPower is floor(log(Target)/log(2)), % that's the largest power we will find in the solution
propose(RS,MaxPower,Target,0), % generate & test a solution into RS
reverse(RS,Solution), % if we are here, we found something! Reverse RS so that it is increasing
my_write(Solution,String,Value), % prettyprinting
format("~s = ~d\n",[String,Value]).
% ---
% propose(ListForSolution,MaxPowerHere,Target,SumSoFar)
% ---
% This is an integrated "generate-and-test". It is designed
% to "fail fast" during proposal - we don't want to propose a
% complete solution, then compute the value for that solution
% and find out that we overshot the target. If we overshoot, we
% want to find out immediately!
%
% So: Propose a new value for the leftmost position L of the
% solution list. We are allowed to propose any integer for L
% from the sequence [MaxPowerHere,...,0]. "Target" is the target
% value we must not overshoot (indeed, which we must meet
% exactly at the end of the recursion). "SumSoFar" is the sum of
% powers "to our left" in the solution list, to which we already
% committed.
propose([L|Ls],MaxPowerHere,Target,SumSoFar) :-
assertion(SumSoFar=<Target),
(SumSoFar=Target -> false ; true), % a slight optimization, no solution if we already reached Target!
propose_value(L,MaxPowerHere), % Generate: L is now (backtrackably) some value from [MaxPowerHere,...,0]
NewSum is (SumSoFar + 2**L),
NewSum =< Target, % Test; if this fails, we backtrack to propose_value/2 and will be back with a next L
NewMaxPowerHere = L, % Test passed; the next power in the sequence should be no larger than the current, i.e. L
propose(Ls,NewMaxPowerHere,Target,NewSum). % Recurse over rest-of-list.
propose([],_,Target,Target). % Terminal test: Only succeed if all values set and the Sum is the Target!
% ---
% propose_value(?X,+Max).
% ---
% Give me a new value X between [Max,0].
% Backtracks over monotonically decreasing integers.
% See the test code for examples.
%
% One could also construct a list of integers [Max,...,0], then
% use "member/2" for backtracking. This would "concretize" the predicate's
% behaviour with an explicit list structure.
%
% "between/3" sadly only generates increasing sequences otherwise one
% could use that. Maybe there is a "between/4" taking a step value somewhere?
% ---
propose_value(X,Max) :-
assertion((integer(Max),Max>=0)),
Max=X.
propose_value(X,Max) :-
assertion((integer(Max),Max>=0)),
Max>0, succ(NewMax,Max),
propose_value(X,NewMax).
% ---
% I like some nice output, so generate a string representing the solution.
% Also, recompute the value to make doubly sure!
% ---
my_write([L|Ls],String,Value) :-
my_write(Ls,StringOnTheRight,ValueOnTheRight),
Value is ValueOnTheRight + 2**L,
with_output_to(string(String),format("2^~d + ~s",[L,StringOnTheRight])).
my_write([L],String,Value) :-
with_output_to(string(String),format("2^~d",[L])),
Value is 2**L.
:- begin_tests(powersum).
% powersum(N,Target,Solution)
test(pv1) :- bagof(X,propose_value(X,3),Bag), Bag = [3,2,1,0].
test(pv2) :- bagof(X,propose_value(X,2),Bag), Bag = [2,1,0].
test(pv2) :- bagof(X,propose_value(X,1),Bag), Bag = [1,0].
test(pv3) :- bagof(X,propose_value(X,0),Bag), Bag = [0].
test(one) :- bagof(S,powersum(1,1,S),Bag), Bag = [[0]].
test(two) :- bagof(S,powersum(3,10,S),Bag), Bag = [[0,0,3],[1,2,2]].
test(three) :- bagof(S,powersum(3,145,S),Bag), Bag = [[0,4,7]].
test(four,fail) :- powersum(3,8457894,_).
test(five) :- bagof(S,powersum(9,8457894,S), Bag), Bag = [[1, 2, 5, 7, 9, 10, 11, 16, 23]]. %% VERY SLOW
:- end_tests(powersum).
rt :- run_tests(powersum).
Running the tests takes about 2 minutes, due to the last unit test line...
?- time(rt).
% PL-Unit: powersum ....2^0 = 1
.2^0 + 2^0 + 2^3 = 10
2^1 + 2^2 + 2^2 = 10
.2^0 + 2^4 + 2^7 = 145
..2^1 + 2^2 + 2^5 + 2^7 + 2^9 + 2^10 + 2^11 + 2^16 + 2^23 = 8457894
. done
% All 9 tests passed
% 455,205,628 inferences, 114.614 CPU in 115.470 seconds (99% CPU, 3971641 Lips)
true.
EDIT: With some suggestive comments from repeat, here is a complete, efficient CLP(FD) solution:
powersum2_(N, Target, Exponents, Solution) :-
length(Exponents, N),
MaxExponent is floor(log(Target) / log(2)),
Exponents ins 0..MaxExponent,
chain(Exponents, #>=),
maplist(exponent_power, Exponents, Solution),
sum(Solution, #=, Target).
exponent_power(Exponent, Power) :-
Power #= 2^Exponent.
powersum2(N, Target, Solution) :-
powersum2_(N, Target, Exponents, Solution),
labeling([], Exponents).
Ordering exponents by #>= cuts down the search space by excluding redundant permutations. But it is also relevant for the order of labeling (with the [] strategy).
The core relation powersum2_/4 posts constraints on the numbers:
?- powersum2_(5, 31, Exponents, Solution).
Exponents = [_954, _960, _966, _972, _978],
Solution = [_984, _990, _996, _1002, _1008],
_954 in 0..4,
_954#>=_960,
2^_954#=_984,
_960 in 0..4,
_960#>=_966,
2^_960#=_990,
_966 in 0..4,
_966#>=_972,
2^_966#=_996,
_972 in 0..4,
_972#>=_978,
2^_972#=_1002,
_978 in 0..4,
2^_978#=_1008,
_1008 in 1..16,
_984+_990+_996+_1002+_1008#=31,
_984 in 1..16,
_990 in 1..16,
_996 in 1..16,
_1002 in 1..16.
And then labeling searches for the actual solutions:
?- powersum2(5, 31, Solution).
Solution = [16, 8, 4, 2, 1] ;
false.
This solution is considerably more efficient than the other answers so far:
?- time(powersum2(9, 8457894, Solution)).
% 6,957,285 inferences, 0.589 CPU in 0.603 seconds (98% CPU, 11812656 Lips)
Solution = [8388608, 65536, 2048, 1024, 512, 128, 32, 4, 2].
Original version follows.
Here is another CLP(FD) solution. The idea is to express "power of two" as a "real" constraint, i.e, not as a predicate that enumerates numbers like lurker's power_of_2/1 does. It helps that the actual constraint to be expressed isn't really "power of two", but rather "power of two less than or equal to a known bound".
So here is some clumsy code to compute a list of powers of two up to a limit:
powers_of_two_bound(PowersOfTwo, UpperBound) :-
powers_of_two_bound(1, PowersOfTwo, UpperBound).
powers_of_two_bound(Power, [Power], UpperBound) :-
Power =< UpperBound,
Power * 2 > UpperBound.
powers_of_two_bound(Power, [Power | PowersOfTwo], UpperBound) :-
Power =< UpperBound,
NextPower is Power * 2,
powers_of_two_bound(NextPower, PowersOfTwo, UpperBound).
?- powers_of_two_bound(Powers, 1023).
Powers = [1, 2, 4, 8, 16, 32, 64, 128, 256|...] ;
false.
... and then to compute a constraint term based on this...
power_of_two_constraint(UpperBound, Variable, Constraint) :-
powers_of_two_bound(PowersOfTwo, UpperBound),
maplist(fd_equals(Variable), PowersOfTwo, PowerOfTwoConstraints),
constraints_operator_combined(PowerOfTwoConstraints, #\/, Constraint).
fd_equals(Variable, Value, Variable #= Value).
constraints_operator_combined([Constraint], _Operator, Constraint).
constraints_operator_combined([C | Cs], Operator, Constraint) :-
Constraint =.. [Operator, C, NextConstraint],
constraints_operator_combined(Cs, Operator, NextConstraint).
?- power_of_two_constraint(1023, X, Constraint).
Constraint = (X#=1#\/(X#=2#\/(X#=4#\/(X#=8#\/(X#=16#\/(X#=32#\/(X#=64#\/(X#=128#\/(... #= ... #\/ ... #= ...))))))))) ;
false.
... and then to post that constraint:
power_of_two(Target, Variable) :-
power_of_two_constraint(Target, Variable, Constraint),
call(Constraint).
?- power_of_two(1023, X).
X in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512 ;
false.
(Seeing this printed in this syntax shows me that I could simplify the code computing the constraint term...)
And then the core relation is:
powersum_(N, Target, Solution) :-
length(Solution, N),
maplist(power_of_two(Target), Solution),
list_monotonic(Solution, #=<),
sum(Solution, #=, Target).
list_monotonic([], _Operation).
list_monotonic([_X], _Operation).
list_monotonic([X, Y | Xs], Operation) :-
call(Operation, X, Y),
list_monotonic([Y | Xs], Operation).
We can run this without labeling:
?- powersum_(9, 1023, S).
S = [_9158, _9164, _9170, _9176, _9182, _9188, _9194, _9200, _9206],
_9158 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9158+_9164+_9170+_9176+_9182+_9188+_9194+_9200+_9206#=1023,
_9164#>=_9158,
_9164 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9170#>=_9164,
_9170 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9176#>=_9170,
_9176 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9182#>=_9176,
_9182 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9188#>=_9182,
_9188 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9194#>=_9188,
_9194 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9200#>=_9194,
_9200 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512,
_9206#>=_9200,
_9206 in ... .. ... \/ 4\/8\/16\/32\/64\/128\/256\/512 ;
false.
And it's somewhat quick when we label:
?- time(( powersum_(8, 255, S), labeling([], S) )), format('S = ~w~n', [S]), false.
% 561,982 inferences, 0.055 CPU in 0.055 seconds (100% CPU, 10238377 Lips)
S = [1,2,4,8,16,32,64,128]
% 1,091,295 inferences, 0.080 CPU in 0.081 seconds (100% CPU, 13557999 Lips)
false.
Contrast this with lurker's approach, which takes much longer even just to find the first solution:
?- time(binary_partition(255, 8, S)), format('S = ~w~n', [S]), false.
% 402,226,596 inferences, 33.117 CPU in 33.118 seconds (100% CPU, 12145562 Lips)
S = [1,2,4,8,16,32,64,128]
% 1,569,157 inferences, 0.130 CPU in 0.130 seconds (100% CPU, 12035050 Lips)
S = [1,2,4,8,16,32,64,128]
% 14,820,953 inferences, 1.216 CPU in 1.216 seconds (100% CPU, 12190530 Lips)
S = [1,2,4,8,16,32,64,128]
% 159,089,361 inferences, 13.163 CPU in 13.163 seconds (100% CPU, 12086469 Lips)
S = [1,2,4,8,16,32,64,128]
% 1,569,155 inferences, 0.134 CPU in 0.134 seconds (100% CPU, 11730834 Lips)
S = [1,2,4,8,16,32,64,128]
% 56,335,514 inferences, 4.684 CPU in 4.684 seconds (100% CPU, 12027871 Lips)
S = [1,2,4,8,16,32,64,128]
^CAction (h for help) ? abort
% 1,266,275,462 inferences, 107.019 CPU in 107.839 seconds (99% CPU, 11832284 Lips)
% Execution Aborted % got bored of waiting
However, this solution is slower than the one by David Tonhofer:
?- time(( powersum_(9, 8457894, S), labeling([], S) )), format('S = ~w~n', [S]), false.
% 827,367,193 inferences, 58.396 CPU in 58.398 seconds (100% CPU, 14168325 Lips)
S = [2,4,32,128,512,1024,2048,65536,8388608]
% 1,715,107,811 inferences, 124.528 CPU in 124.532 seconds (100% CPU, 13772907 Lips)
false.
versus:
?- time(bagof(S,powersum(9,8457894,S), Bag)).
2^1 + 2^2 + 2^5 + 2^7 + 2^9 + 2^10 + 2^11 + 2^16 + 2^23 = 8457894
% 386,778,067 inferences, 37.705 CPU in 37.706 seconds (100% CPU, 10258003 Lips)
Bag = [[1, 2, 5, 7, 9, 10, 11, 16|...]].
There's probably room to improve my constraints, or maybe some magic labeling strategy that will improve the search.
EDIT: Ha! Labeling from the largest to the smallest element changes the performance quite dramatically:
?- time(( powersum_(9, 8457894, S), reverse(S, Rev), labeling([], Rev) )), format('S = ~w~n', [S]), false.
% 5,320,573 inferences, 0.367 CPU in 0.367 seconds (100% CPU, 14495124 Lips)
S = [2,4,32,128,512,1024,2048,65536,8388608]
% 67 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 2618313 Lips)
false.
So this is now about 100x as fast as David Tonhofer's version. I'm content with that :-)
Here's a scheme that uses CLP(FD). In general, when reasoning in the domain of integers in Prolog, CLP(FD) is a good way to go. The idea for this particular problem is to think recursively (as in many Prolog problems) and use a "bifurcation" approach.
As David said in his answer, solutions to problems like this don't just flow out on the first attempt. There are preliminary notions, trial implementations, tests, observations, and revisions that go into coming up with the solution to a problem. Even this one could use more work. :)
:- use_module(library(clpfd)).
% Predicate that succeeds for power of 2
power_of_2(1).
power_of_2(N) :-
N #> 1,
NH #= N // 2,
N #= NH * 2,
power_of_2(NH).
% Predicate that succeeds for a list that is monotonically ascending
ascending([_]).
ascending([X1,X2|Xs]) :-
X1 #=< X2,
ascending([X2|Xs]).
% Predicate that succeeds if Partition is a K-part partition of N
% where the parts are powers of 2
binary_partition(N, K, Partition) :-
binary_partition_(N, K, Partition),
ascending(Partition). % Only allow ascending lists as solutions
binary_partition_(N, 1, [N]) :- % base case
power_of_2(N).
binary_partition_(N, K, P) :-
N #> 1, % constraints on N, K
K #> 1,
length(P, K), % constraint on P
append(LL, LR, P), % conditions on left/right bifurcation
NL #> 0,
NR #> 0,
KL #> 0,
KR #> 0,
NL #=< NR, % don't count symmetrical cases
KL #=< KR,
N #= NL + NR,
K #= KL + KR,
binary_partition_(NL, KL, LL),
binary_partition_(NR, KR, LR).
This will provide correct results, but it also generates redundant solutions:
2 ?- binary_partition(9,4,L).
L = [1, 2, 2, 4] ;
L = [1, 2, 2, 4] ;
false.
As an exercise, you can figure out how to modify it so it only generates unique solutions. :)
my_power_of_two_bound(U,P):-
U #>= 2^P,
P #=< U,
P #>=0.
power2(X,Y):-
Y #= 2^X.
Query:
?- N=9,K=4,
length(_List,K),
maplist(my_power_of_two_bound(N),_List),
maplist(power2,_List,Answer),
chain(Answer, #=<),
sum(Answer, #=, N),
label(Answer).
Then:
Answer = [1, 2, 2, 4],
K = 4,
N = 9

Puzzle taken from Gardner

I'm trying to solve the following puzzle in Prolog:
In ten cells numbered 0,...,9, inscribe a 10-digit number such that each cell, say cell i, indicates the total number of occurrences of the digit i in this number. Find this number. The answer is 6210001000.
This is what I wrote in Prolog, but I'm stuck; I think there is something wrong with my ten_digit predicate:
%count: used to count number of occurrence of an element in a list
count(_,[],0).
count(X,[X|T],N) :-
count(X,T,N2),
N is 1 + N2.
count(X,[Y|T],Count) :-
X \= Y,
count(X,T,Count).
%check: e.g. for position = 1, count how many times 1 occurs in the list and check whether that equals the value at position 1
check(Pos,List) :-
count(Pos,List,Count),
valueOf(Pos,List,X),
X == Count.
%valueOf: get the value from a list given the index
valueOf(0,[H|_],H).
valueOf(I,[_|T],Z) :-
I2 is I-1,
valueOf(I2,T,Z).
%ten_digit: generate the 10-digit number
ten_digit(X):-
ten_digit([0,1,2,3,4,5,6,7,8,9],X).
ten_digit([],[]).
ten_digit([Nul|Rest],Digits) :-
check(Nul,Digits),
ten_digit(Rest,Digits).
How do I solve this puzzle?
Check out the clpfd constraint global_cardinality/2.
For example, using SICStus Prolog or SWI:
:- use_module(library(clpfd)).
ten_cells(Ls) :-
numlist(0, 9, Nums),
pairs_keys_values(Pairs, Nums, Ls),
global_cardinality(Ls, Pairs).
Sample query and its result:
?- time((ten_cells(Ls), labeling([ff], Ls))).
1,359,367 inferences, 0.124 CPU in 0.124 seconds (100% CPU, 10981304 Lips)
Ls = [6, 2, 1, 0, 0, 0, 1, 0, 0, 0] ;
319,470 inferences, 0.028 CPU in 0.028 seconds (100% CPU, 11394678 Lips)
false.
This gives you one solution, and also shows that it is unique.
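The relation can also be used in the checking direction, for example to verify the known answer (my own query, not part of the original answer):
?- ten_cells([6,2,1,0,0,0,1,0,0,0]).
true.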
CLP(FD) rules... solving this puzzle in plain Prolog is not easy...
ten_digit(Xs):-
length(Xs, 10),
assign(Xs, Xs, 0).
assign([], _, 10).
assign([X|Xs], L, P) :-
member(X, [9,8,7,6,5,4,3,2,1,0]),
count(L, P, X),
Q is P+1,
assign(Xs, L, Q),
count(L, P, X).
count(L, P, 0) :- maplist(\==(P), L).
count([P|Xs], P, C) :-
C > 0,
B is C-1,
count(Xs, P, B).
count([X|Xs], P, C) :-
X \== P,
C > 0,
count(Xs, P, C).
This is far less efficient than @mat's solution:
?- time(ten_digit(L)),writeln(L).
% 143,393 inferences, 0.046 CPU in 0.046 seconds (100% CPU, 3101601 Lips)
[6,2,1,0,0,0,1,0,0,0]
L = [6, 2, 1, 0, 0, 0, 1, 0, 0|...] ;
% 11,350,690 inferences, 3.699 CPU in 3.705 seconds (100% CPU, 3068953 Lips)
false.
count/3 acts in a peculiar way... it binds free variables up to the current limit, then checks that no more are bound.
Edit: adding a cut, the snippet becomes really fast:
...
assign(Xs, L, Q),
!, count(L, P, X).
?- time(ten_digit(L)),writeln(L).
% 137,336 inferences, 0.045 CPU in 0.045 seconds (100% CPU, 3075529 Lips)
[6,2,1,0,0,0,1,0,0,0]
L = [6, 2, 1, 0, 0, 0, 1, 0, 0|...] ;
% 3 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 54706 Lips)
false.
Sorry, I could not resist. This problem can also be conveniently expressed as a Mixed Integer Programming (MIP) model. A little bit more mathy than Prolog.
The results are the same:
---- VAR n  digit i

          LOWER     LEVEL     UPPER   MARGINAL
digit0     -INF    6.0000     +INF       .
digit1     -INF    2.0000     +INF       .
digit2     -INF    1.0000     +INF       .
digit3     -INF         .     +INF       .
digit4     -INF         .     +INF       .
digit5     -INF         .     +INF       .
digit6     -INF    1.0000     +INF       .
digit7     -INF         .     +INF       .
digit8     -INF         .     +INF       .
digit9     -INF         .     +INF       .

Reversible numerical calculations in Prolog

While reading SICP I came across logic programming, chapter 4.4. Then I started looking into the Prolog programming language and tried to understand some simple assignments in Prolog. I found that Prolog seems to have trouble with numerical calculations.
Here is the computation of a factorial in standard Prolog:
f(0, 1).
f(A, B) :- A > 0, C is A-1, f(C, D), B is A*D.
The issues I find are that I need to introduce two auxiliary variables (C and D) and a new syntax (is), and that the problem is non-reversible (i.e., f(5,X) works as expected, but f(X,120) does not).
Naively, I expect that at the very least C is A-1, f(C, D) above may be replaced by f(A-1,D), but even that does not work.
My question is: Why do I need to do this extra "stuff" in numerical calculations but not in other queries?
I do understand (and SICP is quite clear about it) that in general information on "what to do" is insufficient to answer the question of "how to do it". So the declarative knowledge in (at least some) math problems is insufficient to actually solve these problems. But that begs the next question: How does this extra "stuff" in Prolog help me to restrict the formulation to just those problems where "what to do" is sufficient to answer "how to do it"?
is/2 is very low-level and limited. As you correctly observe, it cannot be used in all directions and is therefore not a true relation.
For reversible arithmetic, use your Prolog system's constraint solvers.
For example, SWI-Prolog's CLP(FD) manual contains the following definition of n_factorial/2:
:- use_module(library(clpfd)).
n_factorial(0, 1).
n_factorial(N, F) :- N #> 0, N1 #= N - 1, F #= N * F1, n_factorial(N1, F1).
The following example queries show that it can be used in all directions:
?- n_factorial(47, F).
F = 258623241511168180642964355153611979969197632389120000000000 ;
false.
?- n_factorial(N, 1).
N = 0 ;
N = 1 ;
false.
?- n_factorial(N, 3).
false.
Of course, this definition still relies on unification, and you can therefore not plug in arbitrary integer expressions. A term like 2-2 (which is -(2,2) in prefix notation) does not unify with 0. But you can easily allow this if you rewrite this to:
:- use_module(library(clpfd)).
n_factorial(N, F) :- N #= 0, F #= 1.
n_factorial(N, F) :- N #> 0, N1 #= N - 1, F #= N * F1, n_factorial(N1, F1).
Example query and its result:
?- n_factorial(2-2, -4+5).
true .
Forget about variables and think of A and B as just names for values that can be placed into a clause (X :- Y). to make it reachable. Think of X = (2 + (3 * 4)) as a data structure which represents a mathematical expression. If you ask Prolog to reach the goal f(A-1, B), it will try to find an atom f(A-1,B). or a rule (f(A-1,B) :- Z), Z. which can be unified to "success".
is/2 tries to unify its first argument with the result of interpreting its second argument as an expression. Consider eval/2 as a variant of is/2:
eval(0, 1-1). eval(0, 2-2). eval(1,2-1).
eval(Y, X-0):- eval(Y, X).
eval(Y, A+B):- eval(ValA, A), eval(ValB, B), eval(Y, ValA + ValB).
eval(4, 2*2).
eval(0, 0*_). eval(0, _*0).
eval(Y, X*1):- eval(Y, X).
eval(Y, 1*X):- eval(Y, X).
eval(Y, A*B):- eval(ValA, A), eval(ValB, B), eval(Y, ValA * ValB).
The reason why f(X,120) doesn't work is simple: >/2 works only when its arguments are bound (i.e. you can't compare something not yet defined, like X, with anything else). To fix that you have to split that rule into:
f(A,B) :- nonvar(A), A > 0, C is A-1, f(C, D), B is A*D.
f(A,B) :- nonvar(B), f_rev(A, B, 1, 1).
% f_rev/4 - only first argument is unbound.
f_rev(A, B, A, B). % solution
f_rev(A, B, N, C):- C < B, NextN is (N+1), NextC is (C*NextN), f_rev(A, B, NextN, NextC).
Update: (fixed f_rev/4)
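With the base case f(0,1) from the original definition kept, the reversed direction should now behave as expected (a sample query, my own check):
?- f(X, 120).
X = 5 ;
false.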
You may be interested in a finite-domain solver. There was a question about using such things. By using #>/2 and #=/2 you can describe a formula and restrictions and then resolve them. But these predicates use special abilities of some Prolog systems which allow a name to be associated with attributes that may help narrow the set of possible values by intersecting restrictions. Some other systems (usually the same ones) allow you to reorder the sequence of processing goals ("suspend").
Also, member(X,[1,2,3,4,5,6,7]), f(X, 120) probably does the same thing as your "other queries" do.
If you are interested in logical languages in general you may also look at the Curry language (there, all non-pure functions are "suspended" until a not-yet-defined value is unified).
In this answer we use clpfd, just like this previous answer did.
:- use_module(library(clpfd)).
For easy head-to-head comparison (later on), we call the predicate presented here n_fac/2:
n_fac(N_expr,F_expr) :-
N #= N_expr, % eval arith expr
F #= F_expr, % eval arith expr
n_facAux(N,F).
Like in this previous answer, n_fac/2 admits the use of arithmetic expressions.
n_facAux(0,1). % 0! = 1
n_facAux(1,1). % 1! = 1
n_facAux(2,2). % 2! = 2
n_facAux(N,F) :-
N #> 2,
F #> N, % redundant constraint
% to help `n_fac(N,N)` terminate
n0_n_fac0_fac(3,N,6,F). % general case starts with "3! = 6"
The helper predicate n_facAux/2 delegates any "real" work to n0_n_fac0_fac/4:
n0_n_fac0_fac(N ,N,F ,F).
n0_n_fac0_fac(N0,N,F0,F) :-
N0 #< N,
N1 #= N0+1, % count "up", not "down"
F1 #= F0*N1, % calc `1*2*...*N`, not `N*(N-1)*...*2*1`
F1 #=< F, % enforce redundant constraint
n0_n_fac0_fac(N1,N,F1,F).
Let's compare n_fac/2 and n_factorial/2!
?- n_factorial(47,F).
F = 258623241511168180642964355153611979969197632389120000000000
; false.
?- n_fac(47,F).
F = 258623241511168180642964355153611979969197632389120000000000
; false.
?- n_factorial(N,1).
N = 0
; N = 1
; false.
?- n_fac(N,1).
N = 0
; N = 1
; false.
?- member(F,[3,1_000_000]), ( n_factorial(N,F) ; n_fac(N,F) ).
false. % both predicates agree
OK! Identical, so far... Why not do a little brute-force testing?
?- time((F1 #\= F2,n_factorial(N,F1),n_fac(N,F2))).
% 57,739,784 inferences, 6.415 CPU in 7.112 seconds (90% CPU, 9001245 Lips)
% Execution Aborted
?- time((F1 #\= F2,n_fac(N,F2),n_factorial(N,F1))).
% 52,815,182 inferences, 5.942 CPU in 6.631 seconds (90% CPU, 8888423 Lips)
% Execution Aborted
?- time((N1 #> 1,N2 #> 1,N1 #\= N2,n_fac(N1,F),n_factorial(N2,F))).
% 99,463,654 inferences, 15.767 CPU in 16.575 seconds (95% CPU, 6308401 Lips)
% Execution Aborted
?- time((N1 #> 1,N2 #> 1,N1 #\= N2,n_factorial(N2,F),n_fac(N1,F))).
% 187,621,733 inferences, 17.192 CPU in 18.232 seconds (94% CPU, 10913552 Lips)
% Execution Aborted
No differences for the first few hundred values of N in 2..sup... Good!
Moving on: How about the following (suggested in a comment to this answer)?
?- n_factorial(N,N), false.
false.
?- n_fac(N,N), false.
false.
Doing fine! Identical termination behaviour... More?
?- N #< 5, n_factorial(N,_), false.
false.
?- N #< 5, n_fac(N,_), false.
false.
?- F in 10..100, n_factorial(_,F), false.
false.
?- F in 10..100, n_fac(_,F), false.
false.
Alright! Still identical termination properties! Let's dig a little deeper! How about the following?
?- F in inf..10, n_factorial(_,F), false.
... % Execution Aborted % does not terminate universally
?- F in inf..10, n_fac(_,F), false.
false. % terminates universally
D'oh! The first query does not terminate, the second does.
What a speedup! :)
Let's do some empirical runtime measurements!
?- member(Exp,[6,7,8,9]), F #= 10^Exp, time(n_factorial(N,F)) ; true.
% 328,700 inferences, 0.043 CPU in 0.043 seconds (100% CPU, 7660054 Lips)
% 1,027,296 inferences, 0.153 CPU in 0.153 seconds (100% CPU, 6735634 Lips)
% 5,759,864 inferences, 1.967 CPU in 1.967 seconds (100% CPU, 2927658 Lips)
% 22,795,694 inferences, 23.911 CPU in 23.908 seconds (100% CPU, 953351 Lips)
true.
?- member(Exp,[6,7,8,9]), F #= 10^Exp, time(n_fac(N,F)) ; true.
% 1,340 inferences, 0.000 CPU in 0.000 seconds ( 99% CPU, 3793262 Lips)
% 1,479 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 6253673 Lips)
% 1,618 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 5129994 Lips)
% 1,757 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 5044792 Lips)
true.
Wow! Some more?
?- member(U,[10,100,1000]), time((N in 1..U,n_factorial(N,_),false)) ; true.
% 34,511 inferences, 0.004 CPU in 0.004 seconds (100% CPU, 9591041 Lips)
% 3,091,271 inferences, 0.322 CPU in 0.322 seconds (100% CPU, 9589264 Lips)
% 305,413,871 inferences, 90.732 CPU in 90.721 seconds (100% CPU, 3366116 Lips)
true.
?- member(U,[10,100,1000]), time((N in 1..U,n_fac(N,_),false)) ; true.
% 3,729 inferences, 0.001 CPU in 0.001 seconds (100% CPU, 2973653 Lips)
% 36,369 inferences, 0.004 CPU in 0.004 seconds (100% CPU, 10309784 Lips)
% 362,471 inferences, 0.036 CPU in 0.036 seconds (100% CPU, 9979610 Lips)
true.
The bottom line?
The code presented in this answer is as low-level as you should go: Forget is/2!
Redundant constraints can and do pay off.
The order of arithmetic operations (counting "up" vs "down") can make quite a difference, too.
If you want to calculate the factorial of some "large" N, consider using a different approach.
Use clpfd!
There are some things which you must remember when looking at Prolog:
There is no implicit return value when you call a predicate. If you want to get a value out of a call you need to add extra arguments which can be used to "return" values, the second argument in your f/2 predicate. While being more verbose it does have the benefit of being easy to return many values.
This means that automatically "evaluating" arguments in a call is really quite meaningless, as there is no value to return, so it is not done. There are no nested calls; in this respect Prolog is flat. So when you call f(A-1, D) the first argument to f/2 is the structure A-1, or really -(A, 1) as - is an infix operator. So if you want to get the value from a call to foo into a call to bar you have to explicitly use a variable to do it like:
foo(..., X), bar(X, ...),
So you need a special predicate which forces arithmetic evaluation, is/2. Its second argument is a structure representing an arithmetic expression, which it interprets and evaluates, unifying the result with its first argument, which can be either a variable or a numerical value.
While in principle you can run things backwards, with most things you can't. Usually it is only possible for simple predicates working on structures, though there are some very useful cases where it is possible. is/2 doesn't work backwards; it would be exceptional if it did.
This is why you need the extra variables C and D and can't replace C is A-1, f(C, D) by f(A-1,D).
(Yes I know you don't make calls in Prolog, but evaluate goals, but we were starting from a functional viewpoint here)
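To see the difference concretely (a standard toplevel check, not part of the original answer): plain unification passes A-1 around as a term, while is/2 actually evaluates it:
?- X = 2-1.
X = 2-1.
?- X is 2-1.
X = 1.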
