Generate n-digit number w/o 1s and 0s in prolog - random

I am a beginner to prolog.
The task is to generate n-digit numbers without using ones and zeroes. How would this be done? Do you generate random numbers and then delete 1s and 0s (sounds inefficient)?

What you describe is definitely one way to do it. As you mention, it is not a particularly good way to do it, since it is so inefficient.
Nevertheless, the approach you mention is so common that it has its own name: It is often referred to as generate and test, since we first generate, and then either reject or accept the solution, or modify it further so that it satisfies all constraints.
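To make the term concrete, here is a minimal generate-and-test sketch for this task, assuming SWI-Prolog. The name gt_digits/2 is made up for this illustration and is not part of the question or the answers. It draws N random digits from 0..9 and throws the whole list away whenever a 0 or a 1 shows up:

```prolog
:- use_module(library(apply)).   % maplist/2
:- use_module(library(random)).  % random_between/3

% Generate: draw N random digits from 0..9.
% Test: reject the list if it contains a 0 or a 1, then try again.
gt_digits(N, Ds) :-
    repeat,
    length(Ds, N),
    maplist(random_between(0, 9), Ds),
    \+ memberchk(0, Ds),
    \+ memberchk(1, Ds),
    !.
```

A single attempt passes the test with probability 0.8^N, so the expected number of retries grows exponentially with N. That is the inefficiency the question is worried about.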
A typically much more efficient approach is to first constrain the whole solution so that all requirements are expressed, and then to let the system search only within the already constrained space. This is especially easy to do in Prolog since it provides built-in constraints that you can post before you even start the search for solutions, and they will be automatically taken into account before and also during the search.
For example, you could do it as follows, using your Prolog system's CLP(FD) constraints to express the desired requirements over integers:
:- use_module(library(clpfd)).

n_digits(N, Ds, Num) :-
        length(Ds, N),
        Ds ins 2..9,
        reverse(Ds, Rs),
        foldl(pow10, Rs, 0-0, _-Num).

pow10(D, Pow0-S0, Pow-S) :-
        Pow #= Pow0 + 1,
        S #= D*10^Pow0 + S0.
Thus, we represent the number as a list of digits, and relate this list to the corresponding integer using suitable constraints. The key point is that all this happens before a single solution is actually generated, and all solutions that are found will satisfy the stated constraints. Moreover, this is a very general relation that works in all directions, and we can use it to test, generate and complete solutions.
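As a sketch of what "all directions" can mean here, the definition is repeated below so the block loads on its own in SWI-Prolog, with a few example queries as comments. The queries are illustrations added here and do not come from the original answer:

```prolog
:- use_module(library(clpfd)).

n_digits(N, Ds, Num) :-
    length(Ds, N),
    Ds ins 2..9,
    reverse(Ds, Rs),
    foldl(pow10, Rs, 0-0, _-Num).

pow10(D, Pow0-S0, Pow-S) :-
    Pow #= Pow0 + 1,
    S #= D*10^Pow0 + S0.

% Test a given number:
%   ?- n_digits(3, Ds, 345), label(Ds).      % Ds = [3, 4, 5]
%   ?- n_digits(3, Ds, 315), label(Ds).      % false, 315 contains a 1
% Complete a partially known list of digits:
%   ?- n_digits(3, [4,D,7], Num), label([D]).
%   % first answer: D = 2, Num = 427
```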
Here is an example query, asking what such 3-digit numbers look like in general:
?- n_digits(3, Ds, N).
In response, we get:
Ds = [_11114, _11120, _11126],
_11114 in 2..9,
_11114*100#=_11182,
_11182 in 200..900,
_11182+_11236#=N,
_11236 in 22..99,
_11282+_11126#=_11236,
_11282 in 20..90,
_11120*10#=_11282,
_11120 in 2..9,
_11126 in 2..9,
N in 222..999.
We can use label/1 to obtain concrete solutions:
?- n_digits(3, Ds, N), label(Ds).
Ds = [2, 2, 2],
N = 222 ;
Ds = [2, 2, 3],
N = 223 ;
Ds = [2, 2, 4],
N = 224 ;
Ds = [2, 2, 5],
N = 225 ;
etc.
When describing tasks where integers are involved, CLP(FD) constraints are often a very good fit and allow very general solutions.
For example, it is easy to incorporate additional requirements:
?- n_digits(3, Ds, N),
N #> 300,
label(Ds).
Ds = [3, 2, 2],
N = 322 ;
Ds = [3, 2, 3],
N = 323 ;
Ds = [3, 2, 4],
N = 324 ;
etc.
We're not in Sparta anymore. See clpfd for more information!

There is surely a lot of elegance in the clp(fd) solution. If you just want a random number with N digits 2..9, there is a more direct approach:
n_digits(N, Ds, Num) :-
length(Ds, N),
maplist(random_between(0'2, 0'9), Ds),
number_codes(Num, Ds).
If you use between instead of random_between, you generate numbers as above. I've produced a comparison at http://swish.swi-prolog.org/p/JXELeTrX.swinb, where we can see
?- time(n_digits(1000, _Ds, N)).
6,010 inferences, 0.003 CPU in 0.003 seconds (100% CPU, 1814820 Lips)
Compare this with the clp(fd) solution, which runs out of memory (256 MB) after 42 seconds:
?- time((n_digits(1000, _Ds, N), label(_Ds))).
62,143,001 inferences, 42.724 CPU in 42.723 seconds (100% CPU, 1454531 Lips)
Out of global stack
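The between/3 variant mentioned above could look roughly like this. The name n_digits_enum/3 is chosen here only to keep it apart from the random version:

```prolog
:- use_module(library(apply)).   % maplist/2

% Enumerate instead of drawing at random: between/3 yields the
% character codes of '2'..'9' one after the other on backtracking.
n_digits_enum(N, Ds, Num) :-
    length(Ds, N),
    maplist(between(0'2, 0'9), Ds),
    number_codes(Num, Ds).

% ?- n_digits_enum(3, Ds, Num).
% Ds = [50, 50, 50], Num = 222 ;
% Ds = [50, 50, 51], Num = 223 ;
% ...
```

Note that Ds holds character codes here (50 is the code of '2'), not the digits themselves.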

I would like to complement the existing answers with an additional version that I think is worth studying on its own.
Please consider the following:
n_digits(N, Ds, Expr) :-
length(Ds, N),
Ds ins 2..9,
foldl(pow10, Ds, 0, Expr).
pow10(D, S, S*10+D).
Sample query:
?- time((n_digits(1000, Ds, Expr), label(Ds))).
% 104,063 inferences, 0.013 CPU in 0.017 seconds (75% CPU, 8214635 Lips)
Ds = [2, 2, 2, 2, 2, 2, 2, 2, 2|...],
Expr = ((((... * ... + 2)*10+2)*10+2)*10+2)*10+2 .
Here, we also declaratively describe the list with constraints, and in addition build an arithmetic CLP(FD) expression that evaluates to the resulting number.
As with the other CLP(FD)-based version, this lets us easily post additional constraints on the number itself.
For example:
?- n_digits(1000, Ds, Expr),
Expr #> 5*10^999,
label(Ds).
Ds = [5, 2, 2, 2, 2, 2, 2, 2, 2|...],
Expr = ((((... * ... + 2)*10+2)*10+2)*10+2)*10+2 .
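If the integer itself is wanted rather than the expression, one possible addition is to relate the two with #=/2. This is a sketch; the wrapper n_digits_num/3 is a name made up for this illustration:

```prolog
:- use_module(library(clpfd)).

n_digits(N, Ds, Expr) :-
    length(Ds, N),
    Ds ins 2..9,
    foldl(pow10, Ds, 0, Expr).

pow10(D, S, S*10+D).

% Num is constrained to equal the value of the expression.
% This can be posted before any digit is known.
n_digits_num(N, Ds, Num) :-
    n_digits(N, Ds, Expr),
    Num #= Expr.

% ?- n_digits_num(3, Ds, Num), Num #> 300, label(Ds).
% first answer: Ds = [3, 2, 2], Num = 322
```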
When you encounter a CLP(FD)-based version that you find too slow for your use case, my recommendation is to look for ways to improve its efficiency. I cannot—at least not with a straight face—recommend to instead turn to lower-level approaches. I have already seen too many talented Prolog programmers getting stuck in the morass of low-level language constructs, and never finding their way to actually improving the higher-level aspects so that they become both general and efficient enough for all use cases that are relevant in practice. This is where their talents would have been needed the most!

I would like to add a further answer to consider the task from an additional perspective.
This time, I would like to start with Jan's answer:
n_digits(N, Ds, Num) :-
length(Ds, N),
maplist(random_between(0'2, 0'9), Ds),
number_codes(Num, Ds).
Its primary merit is that it is very straightforward and also fast:
?- time(n_digits(1000, Ds, Num)).
% 6,007 inferences, 0.002 CPU in 0.002 seconds (92% CPU, 2736674 Lips)
In fact, it's so fast that it only works some of the time:
?- length(_, N), n_digits(3, Ds, 345).
N = 548,
Ds = [51, 52, 53] ;
N = 1309,
Ds = [51, 52, 53] ;
N = 1822,
Ds = [51, 52, 53] .
However, by this time we are already accustomed to the well-known fact that the correctness of solutions is only of secondary concern when compared to their performance, so let us continue as if the solution were correct.
We can quite directly map this to CLP(FD) constraints, by replacing the maplist/2 goal as follows:
n_digits(N, Ds, Num) :-
length(Ds, N),
Ds ins 0'2..0'9,
labeling([random_value(0)], Ds),
number_codes(Num, Ds).
Does anyone have trouble understanding this? Please let me know, I will do what I can to make it clear to everyone who is interested. The best approach is to simply file a new question if anybody would like to know more.
For comparison, here is the previous query again, which illustrates that this predicate acts as we would expect from a relation:
?- length(_, N), n_digits(3, Ds, 345).
N = 0,
Ds = [51, 52, 53] ;
N = 1,
Ds = [51, 52, 53] ;
N = 2,
Ds = [51, 52, 53] .
In this concrete case, the cost of using CLP(FD) constraints is about one order of magnitude in performance, using a constraint solver that is not optimized for performance, mind you:
?- time(n_digits(1000, Ds, Num)).
% 134,580 inferences, 0.023 CPU in 0.026 seconds (87% CPU, 5935694 Lips)
I barely dare mention that the CLP(FD)-based version works more reliably, since this is of so little value in practice. Thus, depending on your use case, the overhead may well be prohibitive. Let us suppose that for this concrete case, we are willing to sacrifice 2 percent of a second in order to use CLP(FD) constraints.
Out of many possible examples, let us now consider the following slight variation of the task:
Let us describe lists of digits that satisfy all constraints, and are also palindromes.
Here is a possible declarative description of palindromes:
palindrome --> [].
palindrome --> [_].
palindrome --> [P], palindrome, [P].
With the CLP(FD)-based version, we can easily combine this constraint with the already stated ones:
n_digits(N, Ds, Num) :-
length(Ds, N),
Ds ins 0'2..0'9,
phrase(palindrome, Ds),
labeling([random_value(0)], Ds),
number_codes(Num, Ds).
Sample query:
?- time(n_digits(1000, Ds, Num)).
% 12,552,433 inferences, 1.851 CPU in 1.943 seconds (95% CPU, 6780005 Lips)
For comparison, how can we incorporate this straight-forward additional constraint into Jan's version? It is tempting to do it as follows:
n_digits(N, Ds, Num) :-
length(Ds, N),
phrase(palindrome, Ds),
maplist(random_between(0'2, 0'9), Ds),
number_codes(Num, Ds).
Pretty straight-forward too, right? The only problem is that it yields:
?- time(n_digits(1000, Ds, Num)).
% 4,013 inferences, 0.041 CPU in 0.046 seconds (88% CPU, 97880 Lips)
false.
Why is this the case?
And now good luck explaining this to any newcomer to the language! When arguing against the purported complexity of explaining CLP(FD)-based versions, please take into account the much, much higher complexity of explaining such purely procedural phenomena, which cannot be understood by declarative reasoning alone.
Further, good luck with solving the task at all with low-level features in such a way that additional constraints can still be easily added.
By the way, sometimes, the query will actually succeed!
?- length(_, N), time(n_digits(3, Ds, Num)).
% 28 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 848485 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1400000 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (80% CPU, 1166667 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 1473684 Lips)
% 28 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 1473684 Lips)
% 26 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 1444444 Lips)
N = 10,
Ds = [52, 50, 52],
Num = 424 .
I can only say: This approach cannot in honesty be recommended, can it?
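For readers who want to see the procedural reason behind the failure, here is a small sketch that is not part of the original answer. After phrase(palindrome, Ds), mirrored positions of the list are one and the same variable. maplist/2 then binds the first half freely, but in the second half random_between/3 is called with an already bound third argument, and that succeeds only if the freshly drawn number happens to match (1 chance in 8 per position). Since random_between/3 leaves no choice point, nothing is retried and the whole query fails:

```prolog
palindrome --> [].
palindrome --> [_].
palindrome --> [P], palindrome, [P].

% Succeeds: after phrase/2, the first and the last element of a
% 3-element palindrome are the very same variable.
shared_ends(Ds) :-
    length(Ds, 3),
    phrase(palindrome, Ds),
    Ds = [First, _, Last],
    First == Last.

% ?- length(Ds, 3), phrase(palindrome, Ds).
% Ds has the shape [A, B, A].
%
% ?- random_between(0'2, 0'9, 0'4).
% succeeds only if the number drawn happens to be 0'4
```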
If palindromes are too unrealistic for you, consider the following task:
Let us describe lists of digits without 0, 1 and 5.
Again, using CLP(FD), this is straight-forward:
n_digits(N, Ds, Num) :-
length(Ds, N),
Ds ins 0'2..0'9,
maplist(#\=(0'5), Ds),
labeling([random_value(0)], Ds),
number_codes(Num, Ds).
Sample query:
?- time(n_digits(1000, Ds, Num)).
% 203,529 inferences, 0.036 CPU in 0.038 seconds (93% CPU, 5662707 Lips)
For comparison, how would you do it without CLP(FD) constraints?
Now let us combine both requirements:
Let us describe lists of digits without 0, 1 and 5 that are also palindromes.
Again, using the CLP(FD) version, we can simply state the requirement:
n_digits(N, Ds, Num) :-
length(Ds, N),
Ds ins 0'2..0'9,
maplist(#\=(0'5), Ds),
phrase(palindrome, Ds),
labeling([random_value(0)], Ds),
number_codes(Num, Ds).
Sample query:
?- time(n_digits(1000, Ds, Num)).
% 20,117,000 inferences, 2.824 CPU in 2.905 seconds (97% CPU, 7123594 Lips)
These examples illustrate that with the general mechanism of constraints, you have actually somewhere to go in case you need slight variations of previous solutions. The code is quite straight-forward to adapt, and it scales reasonably well.
If there is anything that is too hard to understand, please file a question!
Now, as one last remark: Both CLP(FD)-based versions I posted earlier give you something that goes way beyond such a direct translation. With the other versions, you can post additional constraints not only on the list of digits, but also on the number itself! That's simply completely out of reach for any version that doesn't use constraints! Such constraints are also taken into account before the search for solutions even begins!

Will this be tail call optimized in SWI-Prolog

step_n(0, I, I).
step_n(N, In, Out) :-
N > 0, plus(N1, 1, N), phase_step(In, T),
step_n(N1, T, Out).
phase_step is a function that transforms data.
Will this step_n run in almost the same memory as phase_step? If not, how should I rewrite it to do so? Will this depend on phase_step having a single solution?
EDIT: After some debugging using prolog_current_frame, I found out that if phase_step is a simple function like Out is In + 1, then optimization happens but not in my use case.
Why is TCO dependent on phase_step predicate?
Will this depend on phase_step having a single solution?
Kind of, but a bit stronger still: It depends on phase_step being deterministic, which means, not leaving any "choice points". A choice point is a future path to be explored; not necessarily one that will produce a further solution, but still something Prolog needs to check.
For example, this is deterministic:
phase_step_det(X, X).
It has a single solution, and Prolog does not prompt us for more:
?- phase_step_det(42, Out).
Out = 42.
The following has a single solution, but it is not deterministic:
phase_step_extrafailure(X, X).
phase_step_extrafailure(_X, _Y) :-
false.
After seeing the solution, there is still something Prolog needs to check. Even if we can tell by looking at the code that that something (the second clause) will fail:
?- phase_step_extrafailure(42, Out).
Out = 42 ;
false.
The following has more than one solution, so it is not deterministic:
phase_step_twosolutions(X, X).
phase_step_twosolutions(X, Y) :-
plus(X, 1, Y).
?- phase_step_twosolutions(42, Out).
Out = 42 ;
Out = 43.
Why is TCO dependent on phase_step predicate?
If there are further paths to be explored, then there must be data about those paths stored somewhere. That "somewhere" is some sort of stack data structure, and for every future path there must be a frame on the stack. This is why your memory usage grows. And with it, the computation time (the following uses copies of your step_n with my corresponding phase_step variants from above):
?- time(step_n_det(100_000, 42, Out)).
% 400,002 inferences, 0.017 CPU in 0.017 seconds (100% CPU, 24008702 Lips)
Out = 42 ;
% 7 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 260059 Lips)
false.
?- time(step_n_extrafailure(100_000, 42, Out)).
% 400,000 inferences, 4.288 CPU in 4.288 seconds (100% CPU, 93282 Lips)
Out = 42 ;
% 100,005 inferences, 0.007 CPU in 0.007 seconds (100% CPU, 13932371 Lips)
false.
?- time(step_n_twosolutions(100_000, 42, Out)).
% 400,000 inferences, 4.231 CPU in 4.231 seconds (100% CPU, 94546 Lips)
Out = 42 ;
% 4 inferences, 0.007 CPU in 0.007 seconds (100% CPU, 548 Lips)
Out = 43 ;
% 8 inferences, 0.005 CPU in 0.005 seconds (100% CPU, 1612 Lips)
Out = 43 ;
% 4 inferences, 0.008 CPU in 0.008 seconds (100% CPU, 489 Lips)
Out = 44 ;
% 12 inferences, 0.003 CPU in 0.003 seconds (100% CPU, 4396 Lips)
Out = 43 ;
% 4 inferences, 0.009 CPU in 0.009 seconds (100% CPU, 451 Lips)
Out = 44 . % many further solutions
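The step_n_* copies used in these timings are not shown in the answer. Presumably they look something like the following sketch, which is a reconstruction from the description and not the answerer's actual code:

```prolog
phase_step_det(X, X).

phase_step_extrafailure(X, X).
phase_step_extrafailure(_X, _Y) :-
    false.

% Copy of step_n/3 that calls the deterministic phase step.
step_n_det(0, I, I).
step_n_det(N, In, Out) :-
    N > 0, plus(N1, 1, N), phase_step_det(In, T),
    step_n_det(N1, T, Out).

% Copy of step_n/3 whose phase step leaves a choice point behind
% on every iteration, although it never yields a second solution.
step_n_extrafailure(0, I, I).
step_n_extrafailure(N, In, Out) :-
    N > 0, plus(N1, 1, N), phase_step_extrafailure(In, T),
    step_n_extrafailure(N1, T, Out).
```

Both variants compute the same single solution. They differ only in how many choice points are still open when that solution is reported.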
One way to explore this is using the SWI-Prolog debugger, which has a way of showing you alternatives (= choice points = future paths to be explored):
?- trace, step_n_det(5, 42, Out).
Call: (9) step_n_det(5, 42, _1496) ? skip % I typed 's' here.
Exit: (9) step_n_det(5, 42, 42) ? alternatives % I typed 'A' here.
[14] step_n_det(0, 42, 42)
Exit: (9) step_n_det(5, 42, 42) ? no debug % I typed 'n' here.
Out = 42 ;
false.
?- trace, step_n_extrafailure(5, 42, Out).
Call: (9) step_n_extrafailure(5, 42, _1500) ? skip
Exit: (9) step_n_extrafailure(5, 42, 42) ? alternatives
[14] step_n_extrafailure(0, 42, 42)
[14] phase_step_extrafailure(42, 42)
[13] phase_step_extrafailure(42, 42)
[12] phase_step_extrafailure(42, 42)
[11] phase_step_extrafailure(42, 42)
[10] phase_step_extrafailure(42, 42)
Exit: (9) step_n_extrafailure(5, 42, 42) ? no debug
Out = 42 ;
false.
All of those alternatives correspond to extra interpreter frames. If you use SWI-Prolog's visual debugger, it will also show you a graph representation of your stack, including all open choice points (though I've always found that hard to make sense of).
So if you want TCO and not grow the stack, you need your phase step to execute deterministically. You can do that by making the phase_step predicate itself deterministic. You can also put a cut after the phase_step call inside step_n.
Here are the calls from above with a cut after each phase_step:
?- time(step_n_det(100_000, 42, Out)).
% 400,001 inferences, 0.017 CPU in 0.017 seconds (100% CPU, 24204529 Lips)
Out = 42 ;
% 7 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 737075 Lips)
false.
?- time(step_n_extrafailure(100_000, 42, Out)).
% 400,000 inferences, 0.023 CPU in 0.023 seconds (100% CPU, 17573422 Lips)
Out = 42 ;
% 5 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 220760 Lips)
false.
?- time(step_n_twosolutions(100_000, 42, Out)).
% 400,000 inferences, 0.023 CPU in 0.023 seconds (100% CPU, 17732727 Lips)
Out = 42 ;
% 5 inferences, 0.000 CPU in 0.000 seconds (94% CPU, 219742 Lips)
false.
Do not place cuts blindly, only once you understand where and why you really need them. Note how in the extrafailure case the cut only removes failures, but in the twosolutions case it removes actual solutions.
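To illustrate the last point, here is a sketch of the twosolutions case with and without the cut. The name step_n_cut/3 is made up here; the answer only describes the change in words:

```prolog
phase_step_twosolutions(X, X).
phase_step_twosolutions(X, Y) :-
    plus(X, 1, Y).

% Without a cut: every choice point of the phase step stays open.
step_n_twosolutions(0, I, I).
step_n_twosolutions(N, In, Out) :-
    N > 0, plus(N1, 1, N), phase_step_twosolutions(In, T),
    step_n_twosolutions(N1, T, Out).

% With a cut right after the phase step: only its first solution
% is kept, and the choice point is discarded.
step_n_cut(0, I, I).
step_n_cut(N, In, Out) :-
    N > 0, plus(N1, 1, N), phase_step_twosolutions(In, T), !,
    step_n_cut(N1, T, Out).

% ?- findall(Out, step_n_twosolutions(2, 42, Out), Outs).
% Outs = [42, 43, 43, 44].
% ?- findall(Out, step_n_cut(2, 42, Out), Outs).
% Outs = [42].
```

The cut makes the stack stay flat, but three of the four solutions are gone. Whether that is acceptable depends entirely on what phase_step is supposed to mean.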
One helpful tool for understanding code performance issues, notably unwanted non-determinism, is a ports profiler tool as found in e.g. ECLiPSe and Logtalk. The Logtalk ports_profiler tool is portable so we can use it here. We start by wrapping your code (from your gist link):
:- use_module(library(lists), []).
:- object(step).
:- public(step_n/3).
:- use_module(lists, [reverse/2]).
% pattern for the nth digit mth-coefficient
digit_m(N, M, D) :-
divmod(M, N, Q, _), divmod(Q, 4, _, C),
(C = 0, D = 0; C = 1, D = 1; C = 2, D = 0; C = 3, D = -1).
calculate_digit_n(N, In, D) :-
calculate_digit_n_(N, In, D, 1, 0).
calculate_digit_n_(_, [], D, _, Acc) :- D1 is abs(Acc), divmod(D1, 10, _, D).
calculate_digit_n_(N, [I | Is], D, M, Acc) :-
digit_m(N, M, C), P is C*I, M1 is M+1, Acc1 is Acc+P,
calculate_digit_n_(N, Is, D, M1, Acc1).
phase_step(In, Out) :-
length(In, L), L1 is L + 1, phase_step_(In, Out, L1, 1, []).
phase_step_(_, Out, L, L, Acc) :- reverse(Out, Acc).
phase_step_(In, Out, L, N, Acc) :-
N < L, calculate_digit_n(N, In, D), N1 is N + 1,
phase_step_(In, Out, L, N1, [D | Acc]).
step_n(0, I, I).
step_n(N, In, Out) :-
prolog_current_frame(Fr), format('~w ', Fr),
N > 0, N1 is N - 1, phase_step(In, T),
step_n(N1, T, Out).
:- end_object.
%:- step_n(10, [1, 2, 3, 4, 5, 6, 7, 8], X).
And then (using SWI-Prolog as the backend as that is the Prolog system you told us you're using):
$ swilgt
...
?- {ports_profiler(loader)}.
% [ /Users/pmoura/logtalk/tools/ports_profiler/ports_profiler.lgt loaded ]
% [ /Users/pmoura/logtalk/tools/ports_profiler/loader.lgt loaded ]
% (0 warnings)
true.
?- logtalk_load(step, [debug(on), source_data(on)]).
% [ /Users/pmoura/step.pl loaded ]
% (0 warnings)
true.
?- step::step_n(10, [1, 2, 3, 4, 5, 6, 7, 8], X).
340 15578 30816 46054 61292 76530 91768 107006 122244 137482
X = [3, 6, 4, 4, 0, 6, 7, 8] .
?- ports_profiler::data.
------------------------------------------------------------------------------
Entity  Predicate             Fact  Rule  Call  Exit  *Exit  Fail  Redo  Error
------------------------------------------------------------------------------
step    calculate_digit_n/3      0    80    80     0     80     0     0      0
step    calculate_digit_n_/5     0   720   720     0    720     0     0      0
step    digit_m/3                0   640   640    40    600     0     0      0
step    phase_step/2             0    10    10     0     10     0     0      0
step    phase_step_/5            0    90    90     0     90     0     0      0
step    step_n/3                 1    10    11     0     11     0     0      0
------------------------------------------------------------------------------
true.
The *Exit column is for non-deterministic exits from the procedure box. For help with the tool and with interpreting the table results, see https://logtalk.org/manuals/devtools/ports_profiler.html. But it is clear at a glance from the table that both phase_step/2 and step_n/3 are non-deterministic.
Update
Note that tail call optimization (TCO) doesn't mean or require the predicate to be deterministic. In your case, TCO can be applied by a Prolog compiler as the last call in the rule for the step_n/3 predicate is a call to itself. That means that a stack frame can be saved on that specific recursive call. It doesn't mean that there are no choice-points being created by what precedes the recursive call. Using once/1 (as you mention in the comments) simply discards the choice-point created when phase_step/2 is called, as that predicate itself is non-deterministic. That's what the table shows. The step_n/3 predicate is also non-deterministic and thus calling it creates a choice-point when the first argument is 0, which happens when you call the predicate with a zero as the first argument or when the proof for the query reaches the base case of this recursive definition.
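One possible way to get rid of both choice points is sketched below. The phase_step/2 shown is a hypothetical stand-in that deliberately leaves a choice point (it is not the code from the gist), and the if-then-else formulation is a suggestion added here, not something the answer proposes:

```prolog
% Hypothetical stand-in for the real phase_step/2: one solution,
% but a second clause that leaves a choice point behind.
phase_step(X, Y) :- Y is X + 1.
phase_step(_, _) :- false.

% A single clause with if-then-else: no choice point for the base
% case, and once/1 discards the one created by phase_step/2.
step_n(N, In, Out) :-
    (   N =:= 0
    ->  Out = In
    ;   N > 0,
        N1 is N - 1,
        once(phase_step(In, T)),
        step_n(N1, T, Out)
    ).
```

As with the original, N must be instantiated when step_n/3 is called. The same caveat as for cuts applies: once/1 silently drops further solutions of phase_step/2, which is only fine if they are not wanted.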

Prolog returning only the max value from all iterations

I have a method that returns me a number on all iterations, but now I need to return only the max value from all of the iterations that were done.
find_max(X, Y):-
find_number(X, Y).
So find_number() returns only 1 number and some text alongside it. For example, if I were to run it, I would get this output:
X = 1, Y = me;
X = 5, Y = you;
X = 6, Y = he;
And the only output I need to return is the X = 6, Y = he;.
I am using SWI-Prolog.
A more portable alternative to the library(aggregate) posted by Willem, as the library is only available in a few Prolog systems, is:
find_max_alt(Xm, Ym) :-
setof(max(X, Y), find_number(X, Y), Solutions),
reverse(Solutions, [max(Xm, Ym)| _]).
This solution also appears to require a smaller number of inferences. Using the data in the question, we get:
?- time(find_max(Xm, Ym)).
% 40 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 800000 Lips)
Xm = 6,
Ym = he.
Versus:
?- time(find_max_alt(Xm, Ym)).
% 25 inferences, 0.000 CPU in 0.000 seconds (76% CPU, 675676 Lips)
Xm = 6,
Ym = he.
The setof/3 predicate is a standard predicate. The reverse/2 predicate is a common list predicate (and much simpler to define than the predicates in the aggregate library).
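The reason this works is that setof/3 returns its solutions sorted in the standard order of terms, so the entry with the largest X ends up last. As a sketch of the same idea, here is a variation that uses last/2 from library(lists) instead of reverse/2. The find_number/2 facts are sample data taken from the output shown in the question, and find_max_last/2 is a name made up here:

```prolog
:- use_module(library(lists)).   % last/2

% Sample data matching the output in the question.
find_number(1, me).
find_number(5, you).
find_number(6, he).

% setof/3 sorts the X-Y pairs by X first, so the maximum is the
% last element of the resulting list.
find_max_last(Xm, Ym) :-
    setof(X-Y, find_number(X, Y), Pairs),
    last(Pairs, Xm-Ym).

% ?- find_max_last(Xm, Ym).
% Xm = 6, Ym = he.
```

If several entries share the largest X, the standard order of Y decides which one is reported.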
You can use the aggregate library for that:
:- use_module(library(aggregate)).
find_max(Xm, Ym):-
aggregate(max(X, Y), find_number(X, Y), max(Xm, Ym)).

On solving Project Euler #303 with Prolog / clpfd

Here comes Project Euler Problem 303, "Multiples with small digits".
For a positive integer n, define f(n) as the least positive multiple of n that, written in base 10, uses only digits ≤ 2.
Thus f(2)=2, f(3)=12, f(7)=21, f(42)=210, f(89)=1121222.
Also, $\sum_{n=1}^{100} \frac{f(n)}{n} = 11363107$.
Find $\sum_{n=1}^{10000} \frac{f(n)}{n}$.
This is the code I have already written / that I want to improve:
:- use_module(library(clpfd)).
n_fn(N,FN) :-
F #> 0,
FN #= F*N,
length(Ds, _),
digits_number(Ds, FN),
Ds ins 0..2,
labeling([min(FN)], Ds).
That code already works for solving a small number of small problem instances:
?- n_fn(2,X).
X = 2
?- n_fn(3,X).
X = 12
?- n_fn(7,X).
X = 21
?- n_fn(42,X).
X = 210
?- n_fn(89,X).
X = 1121222
What can I do to tackle above challenge "find: sum(n=1 to 10000)(f(n)/n)"?
How can I solve more and bigger instances in reasonable time?
Please share your ideas with me! Thank you in advance!
It is slow on 9's and there is a pattern..
so..
n_fn(9,12222):-!.
n_fn(99,1122222222):-!.
n_fn(999,111222222222222):-!.
n_fn(9999,11112222222222222222):-!.
But I'm sure it would be nicer to have Prolog find this pattern and adapt the search.. not sure how you would do that though!
In general it must be recalculating a lot of results..
I cannot spot a recurrence relation for this problem. So, initially I was thinking that memoizing could speed it up. Not really...
This code, clp(fd) based, is marginally faster than yours...
n_fn_d(N,FN) :-
F #> 0,
FN #= F*N,
digits_number_d([D|Ds], Ts),
D in 1..2,
Ds ins 0..2,
scalar_product(Ts, [D|Ds], #=, FN),
labeling([min(FN)], [D|Ds]).
digits_number_d([_], [1]).
digits_number_d([_|Ds], [T,H|Ts]) :-
digits_number_d(Ds, [H|Ts]), T #= H*10.
When I used clp(fd) to solve problems from Euler, I stumbled upon poor performance... sometimes the simpler 'generate and test' paired with native arithmetic makes a difference.
This simpler one, 'native' based:
n_fn_e(N,FN) :-
digits_e(FN),
0 =:= FN mod N.
digits_e(N) :-
length([X|Xs], _),
maplist(code_e, [X|Xs]), X \= 0'0,
number_codes(N, [X|Xs]).
code_e(0'0).
code_e(0'1).
code_e(0'2).
It's way faster:
test(N) :-
time(n_fn(N,A)),
time(n_fn_d(N,B)),
time(n_fn_e(N,C)),
writeln([A,B,C]).
?- test(999).
% 473,671,146 inferences, 175.006 CPU in 182.242 seconds (96% CPU, 2706593 Lips)
% 473,405,175 inferences, 173.842 CPU in 178.071 seconds (98% CPU, 2723188 Lips)
% 58,724,230 inferences, 25.749 CPU in 26.039 seconds (99% CPU, 2280636 Lips)
[111222222222222,111222222222222,111222222222222]
true
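To get from single values of f(n) to the sum the problem asks for, a driver along these lines could be used. This is a sketch: sum_fn_over_n/2 is a name made up here, and nothing is claimed about how long it takes for 10000. It relies on digits_e/1 enumerating candidates in increasing order, so the first solution of n_fn_e/2 is the least multiple:

```prolog
:- use_module(library(apply)).   % maplist/2
:- use_module(library(lists)).   % sum_list/2

n_fn_e(N, FN) :-
    digits_e(FN),
    0 =:= FN mod N.

digits_e(N) :-
    length([X|Xs], _),
    maplist(code_e, [X|Xs]), X \= 0'0,
    number_codes(N, [X|Xs]).

code_e(0'0).
code_e(0'1).
code_e(0'2).

% Sum of f(n)/n for n = 1..Max; once/1 keeps only the least multiple.
sum_fn_over_n(Max, Sum) :-
    findall(Q,
            ( between(1, Max, N),
              once(n_fn_e(N, FN)),
              Q is FN // N ),
            Qs),
    sum_list(Qs, Sum).

% ?- sum_fn_over_n(10, Sum).
% Sum = 1389.
```

The value for Max = 10 is dominated by n = 9, where f(9) = 12222 contributes 1358.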

Prolog: calculating OEIS A031877 ("nontrivial reversal numbers") using clp(FD)

Browsing through the awesome On-Line Encyclopedia of Integer Sequences (cf. en.wikipedia.org), I stumbled upon the following integer sequence:
A031877: Nontrivial reversal numbers (numbers which are integer multiples of their reversals), excluding palindromic numbers and multiples of 10.
By re-using some code I wrote for my answer to the related question "Faster implementation of verbal arithmetic in Prolog" I could
write down a solution quite effortlessly—thanks to clpfd!
:- use_module(library(clpfd)).
We define the core relation a031877_ndigits_/3 based on
digits_number/2 (defined earlier):
a031877_ndigits_(Z_big,N_digits,[K,Z_small,Z_big]) :-
K #> 1,
length(D_big,N_digits),
reverse(D_small,D_big),
digits_number(D_big,Z_big),
digits_number(D_small,Z_small),
Z_big #= Z_small * K.
The core relation is deterministic and terminates universally whenever
N_digits is a concrete integer. See for yourself for the first 100 values of N_digits!
?- time((N in 0..99,indomain(N),a031877_ndigits_(Z,N,Zs),false)).
% 3,888,222 inferences, 0.563 CPU in 0.563 seconds (100% CPU, 6903708 Lips)
false
Let's run some queries!
?- a031877_ndigits_(87912000000087912,17,_).
true % succeeds, as expected
; false.
?- a031877_ndigits_(87912000000987912,17,_).
false. % fails, as expected
Next, let's find some non-trivial reversal numbers comprising exactly four decimal-digits:
?- a031877_ndigits_(Z,4,Zs), labeling([],Zs).
Z = 8712, Zs = [4,2178,8712]
; Z = 9801, Zs = [9,1089,9801]
; false.
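As an independent cross-check of these two answers, here is a brute-force sketch with plain integer arithmetic. It is neither reversible nor meant as a replacement for the constraint-based relation, and reversal_multiple/2 is a name made up for this illustration:

```prolog
:- use_module(library(lists)).   % reverse/2

% Z is a nontrivial reversal number with multiplier K.
% Z must be given; this is a test, not a generator.
reversal_multiple(Z, K) :-
    Z mod 10 =\= 0,              % exclude multiples of 10
    number_codes(Z, Cs),
    reverse(Cs, Rs),
    number_codes(R, Rs),
    Z =\= R,                     % exclude palindromes
    Z mod R =:= 0,
    K is Z // R.

% ?- findall(Z-K, ( between(1000, 9999, Z), reversal_multiple(Z, K) ), Zs).
% Zs = [8712-4, 9801-9].
```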
OK! Let's measure the runtime needed to prove universal termination of above query!
?- time((a031877_ndigits_(Z,4,Zs),labeling([],Zs),false)).
% 11,611,502 inferences, 3.642 CPU in 3.641 seconds (100% CPU, 3188193 Lips)
false. % terminates universally
Now, that's way too long!
What can I do to speed things up? Use different and/or other constraints? Maybe even redundant ones? Or maybe identify and eliminate symmetries which slash the search space size? What about different clp(*) domains (b,q,r,set)? Or different consistency/propagation techniques? Or rather Prolog style coroutining?
Got ideas? I want them all! Thanks in advance.
So far ... no answers:(
I came up with the following...
How about using different variables for labeling/2?
a031877_ndigitsNEW_(Z_big,N_digits,/* [K,Z_small,Z_big] */
[K|D_big]) :-
K #> 1,
length(D_big,N_digits),
reverse(D_small,D_big),
digits_number(D_big,Z_big),
digits_number(D_small,Z_small),
Z_big #= Z_small * K.
Let's measure some runtimes!
?- time((a031877_ndigits_(Z,4,Zs),labeling([ff],Zs),false)).
% 14,849,250 inferences, 4.545 CPU in 4.543 seconds (100% CPU, 3267070 Lips)
false.
?- time((a031877_ndigitsNEW_(Z,4,Zs),labeling([ff],Zs),false)).
% 464,917 inferences, 0.052 CPU in 0.052 seconds (100% CPU, 8962485 Lips)
false.
Better! But can we go further?
?- time((a031877_ndigitsNEW_(Z,5,Zs),labeling([ff],Zs),false)).
% 1,455,670 inferences, 0.174 CPU in 0.174 seconds (100% CPU, 8347374 Lips)
false.
?- time((a031877_ndigitsNEW_(Z,6,Zs),labeling([ff],Zs),false)).
% 5,020,125 inferences, 0.614 CPU in 0.613 seconds (100% CPU, 8181572 Lips)
false.
?- time((a031877_ndigitsNEW_(Z,7,Zs),labeling([ff],Zs),false)).
% 15,169,630 inferences, 1.752 CPU in 1.751 seconds (100% CPU, 8657015 Lips)
false.
There is still lots of room for improvement, for sure! There must be...
We can do better by translating number-theoretic properties into the language of constraints!
All terms are of the form 87...12 = 4*21...78 or 98...01 = 9*10...89.
We implement a031877_ndigitsNEWER_/3 based on a031877_ndigitsNEW_/3 and directly add above property as two finite-domain constraints:
a031877_ndigitsNEWER_(Z_big,N_digits,[K|D_big]) :-
K in {4}\/{9}, % (new)
length(D_big,N_digits),
D_big ins (0..2)\/(7..9), % (new)
reverse(D_small,D_big),
digits_number(D_big,Z_big),
digits_number(D_small,Z_small),
Z_big #= Z_small * K.
Let's re-run the benchmarks we used before!
?- time((a031877_ndigitsNEWER_(Z,5,Zs),labeling([ff],Zs),false)).
% 73,011 inferences, 0.006 CPU in 0.006 seconds (100% CPU, 11602554 Lips)
false.
?- time((a031877_ndigitsNEWER_(Z,6,Zs),labeling([ff],Zs),false)).
% 179,424 inferences, 0.028 CPU in 0.028 seconds (100% CPU, 6399871 Lips)
false.
?- time((a031877_ndigitsNEWER_(Z,7,Zs),labeling([ff],Zs),false)).
% 348,525 inferences, 0.037 CPU in 0.037 seconds (100% CPU, 9490920 Lips)
false.
Summary: For the three queries, we consistently observed a significant reduction of search required. Just consider how much the inference counts shrank: 1.45M -> 73k, 5M -> 179k, 15.1M -> 348k.
Can we do even better (while preserving declarativity of the code)? I don't know, I guess so...

Reversible numerical calculations in Prolog

While reading SICP I came across logic programming chapter 4.4. Then I started looking into the Prolog programming language and tried to understand some simple assignments in Prolog. I found that Prolog seems to have troubles with numerical calculations.
Here is the computation of a factorial in standard Prolog:
f(0, 1).
f(A, B) :- A > 0, C is A-1, f(C, D), B is A*D.
The issues I find is that I need to introduce two auxiliary variables (C and D), a new syntax (is) and that the problem is non-reversible (i.e., f(5,X) works as expected, but f(X,120) does not).
Naively, I expect that at the very least C is A-1, f(C, D) above may be replaced by f(A-1,D), but even that does not work.
My question is: Why do I need to do this extra "stuff" in numerical calculations but not in other queries?
I do understand (and SICP is quite clear about it) that in general information on "what to do" is insufficient to answer the question of "how to do it". So the declarative knowledge in (at least some) math problems is insufficient to actually solve these problems. But that begs the next question: How does this extra "stuff" in Prolog help me to restrict the formulation to just those problems where "what to do" is sufficient to answer "how to do it"?
is/2 is very low-level and limited. As you correctly observe, it cannot be used in all directions and is therefore not a true relation.
For reversible arithmetic, use your Prolog system's constraint solvers.
For example, SWI-Prolog's CLP(FD) manual contains the following definition of n_factorial/2:
:- use_module(library(clpfd)).
n_factorial(0, 1).
n_factorial(N, F) :- N #> 0, N1 #= N - 1, F #= N * F1, n_factorial(N1, F1).
The following example queries show that it can be used in all directions:
?- n_factorial(47, F).
F = 258623241511168180642964355153611979969197632389120000000000 ;
false.
?- n_factorial(N, 1).
N = 0 ;
N = 1 ;
false.
?- n_factorial(N, 3).
false.
Of course, this definition still relies on unification, and you can therefore not plug in arbitrary integer expressions. A term like 2-2 (which is -(2,2) in prefix notation) does not unify with 0. But you can easily allow this if you rewrite this to:
:- use_module(library(clpfd)).
n_factorial(N, F) :- N #= 0, F #= 1.
n_factorial(N, F) :- N #> 0, N1 #= N - 1, F #= N * F1, n_factorial(N1, F1).
Example query and its result:
?- n_factorial(2-2, -4+5).
true .
Forget about variables for a moment and think of A and B as just names for values that can be placed into a clause (X :- Y) to make it provable. Think of X = (2 + (3 * 4)) as a data structure that represents a mathematical expression. If you ask Prolog to prove the goal f(A-1, B), it will look for a fact f(A-1, B). or a rule (f(A-1, B) :- Z) whose body Z succeeds; the term A-1 is passed along as it is and is never evaluated.
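A small sketch of that difference, with a predicate name (term_vs_value/0) made up for this illustration:

```prolog
% Unification compares structure; is/2 evaluates its right-hand side.
term_vs_value :-
    X = 5-1,
    X == -(5, 1),       % 5-1 is just the compound term -(5,1)
    \+ X = 4,           % so it does not unify with the integer 4
    Y is X,             % is/2 evaluates the term arithmetically
    Y == 4.

% ?- term_vs_value.
% true.
```

This is why a recursive call like f(A-1, D) never reaches the base case f(0, 1): the first argument becomes 5-1, then 5-1-1, and so on, and none of these terms unifies with 0.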
is/2 tries to unify its first argument with the result of interpreting its second argument as an arithmetic expression. Consider eval/2 as a variant of is/2:
eval(0, 1-1). eval(0, 2-2). eval(1,2-1).
eval(Y, X-0):- eval(Y, X).
eval(Y, A+B):- eval(ValA, A), eval(ValB, B), eval(Y, ValA + ValB).
eval(4, 2*2).
eval(0, 0*_). eval(0, _*0).
eval(Y, X*1):- eval(Y, X).
eval(Y, 1*X):- eval(Y, X).
eval(Y, A*B):- eval(ValA, A), eval(ValB, B), eval(Y, ValA * ValB).
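The clauses above only tabulate a handful of cases to convey the idea. As a runnable sketch of the same idea, the following hypothetical my_eval/2 walks the term structure recursively and leaves only the arithmetic on plain integers to is/2. It keeps the argument order of eval/2 (value first, expression second) and assumes the expression is ground and built from integers, +, - and *.

```prolog
% my_eval(?Value, +Expr): hypothetical mini-evaluator.
% Expr must be ground; otherwise the recursion does not terminate.
my_eval(N, N) :- integer(N).                     % an integer evaluates to itself
my_eval(V, A+B) :- my_eval(VA, A), my_eval(VB, B), V is VA + VB.
my_eval(V, A-B) :- my_eval(VA, A), my_eval(VB, B), V is VA - VB.
my_eval(V, A*B) :- my_eval(VA, A), my_eval(VB, B), V is VA * VB.
```

For example, ?- my_eval(V, 2+3*4). gives V = 14: the term is +(2, *(3,4)), so the second clause splits it and the fourth clause handles the product.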
The reason why f(X,120) doesn't work is simple: >/2 works only when its arguments are bound (i.e. you can't compare something not yet defined, like X, with anything else). To fix that, you have to split that rule into:
f(A,B) :- nonvar(A), A > 0, C is A-1, f(C, D), B is A*D.
f(A,B) :- nonvar(B), f_rev(A, B, 1, 1).
% f_rev/4 - only first argument is unbound.
f_rev(A, B, A, B). % solution
f_rev(A, B, N, C):- C < B, NextN is (N+1), NextC is (C*NextN), f_rev(A, B, NextN, NextC).
Update: (fixed f_rev/4)
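For completeness, here is a self-contained version that can be loaded and queried. It assumes that the question's base case was f(0, 1), which is not shown in this answer. The remaining clauses are the ones given above.

```prolog
% Base case: an assumption taken over from the question's definition.
f(0, 1).
% Forward direction: A is known, compute B.
f(A, B) :- nonvar(A), A > 0, C is A-1, f(C, D), B is A*D.
% Backward direction: B is known, search for A by counting up.
f(A, B) :- nonvar(B), f_rev(A, B, 1, 1).

% f_rev(A, B, N, C): C is N!; stop when C reaches B.
f_rev(A, B, A, B).
f_rev(A, B, N, C) :- C < B, NextN is N+1, NextC is C*NextN, f_rev(A, B, NextN, NextC).
```

With these clauses, ?- f(5, B). gives B = 120 and ?- f(X, 120). gives X = 5, while ?- f(X, 3). fails because no factorial equals 3.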
You may be interested in a finite-domain solver. There was a question about using such things. Using #>/2 and #=/2 you can describe formulas and restrictions and then solve them. But these predicates rely on special abilities of some Prolog systems that allow attributes to be associated with a variable, which helps to narrow the set of possible values by intersecting restrictions. Some other systems (usually the same ones) allow you to reorder the sequence in which goals are processed ("suspend").
Also, member(X,[1,2,3,4,5,6,7]), f(X, 120) is probably doing the same thing as your "other queries" do.
If you are interested in logic languages in general, you may also look at the Curry language (there, all non-pure functions are "suspended" until a not-yet-defined value is unified).
In this answer we use clpfd, just like this previous answer did.
:- use_module(library(clpfd)).
For easy head-to-head comparison (later on), we call the predicate presented here n_fac/2:
n_fac(N_expr,F_expr) :-
N #= N_expr, % eval arith expr
F #= F_expr, % eval arith expr
n_facAux(N,F).
Like in this previous answer, n_fac/2 admits the use of arithmetic expressions.
n_facAux(0,1). % 0! = 1
n_facAux(1,1). % 1! = 1
n_facAux(2,2). % 2! = 2
n_facAux(N,F) :-
N #> 2,
F #> N, % redundant constraint
% to help `n_fac(N,N)` terminate
n0_n_fac0_fac(3,N,6,F). % general case starts with "3! = 6"
The helper predicate n_facAux/2 delegates any "real" work to n0_n_fac0_fac/4:
n0_n_fac0_fac(N ,N,F ,F).
n0_n_fac0_fac(N0,N,F0,F) :-
N0 #< N,
N1 #= N0+1, % count "up", not "down"
F1 #= F0*N1, % calc `1*2*...*N`, not `N*(N-1)*...*2*1`
F1 #=< F, % enforce redundant constraint
n0_n_fac0_fac(N1,N,F1,F).
Let's compare n_fac/2 and n_factorial/2!
?- n_factorial(47,F).
F = 258623241511168180642964355153611979969197632389120000000000
; false.
?- n_fac(47,F).
F = 258623241511168180642964355153611979969197632389120000000000
; false.
?- n_factorial(N,1).
N = 0
; N = 1
; false.
?- n_fac(N,1).
N = 0
; N = 1
; false.
?- member(F,[3,1_000_000]), ( n_factorial(N,F) ; n_fac(N,F) ).
false. % both predicates agree
OK! Identical, so far... Why not do a little brute-force testing?
?- time((F1 #\= F2,n_factorial(N,F1),n_fac(N,F2))).
% 57,739,784 inferences, 6.415 CPU in 7.112 seconds (90% CPU, 9001245 Lips)
% Execution Aborted
?- time((F1 #\= F2,n_fac(N,F2),n_factorial(N,F1))).
% 52,815,182 inferences, 5.942 CPU in 6.631 seconds (90% CPU, 8888423 Lips)
% Execution Aborted
?- time((N1 #> 1,N2 #> 1,N1 #\= N2,n_fac(N1,F),n_factorial(N2,F))).
% 99,463,654 inferences, 15.767 CPU in 16.575 seconds (95% CPU, 6308401 Lips)
% Execution Aborted
?- time((N1 #> 1,N2 #> 1,N1 #\= N2,n_factorial(N2,F),n_fac(N1,F))).
% 187,621,733 inferences, 17.192 CPU in 18.232 seconds (94% CPU, 10913552 Lips)
% Execution Aborted
No differences for the first few hundred values of N in 2..sup... Good!
Moving on: How about the following (suggested in a comment to this answer)?
?- n_factorial(N,N), false.
false.
?- n_fac(N,N), false.
false.
Doing fine! Identical termination behaviour... More?
?- N #< 5, n_factorial(N,_), false.
false.
?- N #< 5, n_fac(N,_), false.
false.
?- F in 10..100, n_factorial(_,F), false.
false.
?- F in 10..100, n_fac(_,F), false.
false.
Alright! Still identical termination properties! Let's dig a little deeper! How about the following?
?- F in inf..10, n_factorial(_,F), false.
... % Execution Aborted % does not terminate universally
?- F in inf..10, n_fac(_,F), false.
false. % terminates universally
D'oh! The first query does not terminate, the second does.
What a speedup! :)
Let's do some empirical runtime measurements!
?- member(Exp,[6,7,8,9]), F #= 10^Exp, time(n_factorial(N,F)) ; true.
% 328,700 inferences, 0.043 CPU in 0.043 seconds (100% CPU, 7660054 Lips)
% 1,027,296 inferences, 0.153 CPU in 0.153 seconds (100% CPU, 6735634 Lips)
% 5,759,864 inferences, 1.967 CPU in 1.967 seconds (100% CPU, 2927658 Lips)
% 22,795,694 inferences, 23.911 CPU in 23.908 seconds (100% CPU, 953351 Lips)
true.
?- member(Exp,[6,7,8,9]), F #= 10^Exp, time(n_fac(N,F)) ; true.
% 1,340 inferences, 0.000 CPU in 0.000 seconds ( 99% CPU, 3793262 Lips)
% 1,479 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 6253673 Lips)
% 1,618 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 5129994 Lips)
% 1,757 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 5044792 Lips)
true.
Wow! Some more?
?- member(U,[10,100,1000]), time((N in 1..U,n_factorial(N,_),false)) ; true.
% 34,511 inferences, 0.004 CPU in 0.004 seconds (100% CPU, 9591041 Lips)
% 3,091,271 inferences, 0.322 CPU in 0.322 seconds (100% CPU, 9589264 Lips)
% 305,413,871 inferences, 90.732 CPU in 90.721 seconds (100% CPU, 3366116 Lips)
true.
?- member(U,[10,100,1000]), time((N in 1..U,n_fac(N,_),false)) ; true.
% 3,729 inferences, 0.001 CPU in 0.001 seconds (100% CPU, 2973653 Lips)
% 36,369 inferences, 0.004 CPU in 0.004 seconds (100% CPU, 10309784 Lips)
% 362,471 inferences, 0.036 CPU in 0.036 seconds (100% CPU, 9979610 Lips)
true.
The bottom line?
The code presented in this answer is as low-level as you should go: Forget is/2!
Redundant constraints can and do pay off.
The order of arithmetic operations (counting "up" vs "down") can make quite a difference, too.
If you want to calculate the factorial of some "large" N, consider using a different approach.
Use clpfd!
There are some things which you must remember when looking at Prolog:
There is no implicit return value when you call a predicate. If you want to get a value out of a call, you need to add extra arguments which can be used to "return" values, like the second argument in your f/2 predicate. While more verbose, this has the benefit of making it easy to return many values.
This means that automatically "evaluating" arguments in a call is really quite meaningless as there is no value to return and it is not done. So there are no nested calls, in this respect Prolog is flat. So when you call f(A-1, D) the first argument to f/2 is the structure A-1, or really -(A, 1) as - is an infix operator. So if you want to get the value from a call to foo into a call to bar you have to explicitly use a variable to do it like:
foo(..., X), bar(X, ...),
So you need a special predicate which forces arithmetic evaluation, is/2. Its second argument is a structure representing an arithmetic expression, which it interprets and evaluates, unifying the result with its first argument, which can be either a variable or a numerical value.
While in principle you can run things backwards, with most things you can't. Usually it is possible only for simple predicates working on structures, though there are some very useful cases where it is possible. is/2 doesn't work backwards; it would be exceptional if it did.
This is why you need the extra variables C and D and can't replace C is A-1, f(C, D) by f(A-1,D).
(Yes I know you don't make calls in Prolog, but evaluate goals, but we were starting from a functional viewpoint here)
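The points above can be checked with a small sketch. Both predicate names, show_structure/3 and decrement/2, are hypothetical helpers introduced only for this illustration.

```prolog
% show_structure/3 (hypothetical name): take a term apart with =../2
% to show that an argument like 5-1 is the compound term -(5, 1),
% not the number 4.
show_structure(Term, Name, Args) :- Term =.. [Name|Args].

% decrement/2 (hypothetical name): the explicit evaluation step
% done by is/2, with the result handed on through the variable C.
decrement(A, C) :- C is A - 1.
```

Here, ?- show_structure(5-1, Name, Args). reports the functor - with the arguments [5, 1], and ?- 5-1 == 4. fails, since nothing has evaluated the term. Only ?- decrement(5, C). produces C = 4, which can then be passed on to the next goal.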

Resources