step_n(0, I, I).
step_n(N, In, Out) :-
N > 0, plus(N1, 1, N), phase_step(In, T),
step_n(N1, T, Out).
phase_step is a function that transforms data.
Will this step_n run in almost the same memory as phase_step? If not, how should I rewrite it to do so? Will this depend on phase_step having a single solution?
EDIT: After some debugging using prolog_current_frame, I found out that if phase_step is a simple function like Out is In + 1, then optimization happens but not in my use case.
Why is TCO dependent on phase_step predicate?
Will this depend on phase_step having a single solution?
Kind of, but a bit stronger still: It depends on phase_step being deterministic, which means, not leaving any "choice points". A choice point is a future path to be explored; not necessarily one that will produce a further solution, but still something Prolog needs to check.
For example, this is deterministic:
phase_step_det(X, X).
It has a single solution, and Prolog does not prompt us for more:
?- phase_step_det(42, Out).
Out = 42.
The following has a single solution, but it is not deterministic:
phase_step_extrafailure(X, X).
phase_step_extrafailure(_X, _Y) :-
false.
After producing the solution, Prolog still has something left to check, even though we can tell by looking at the code that this something (the second clause) will fail:
?- phase_step_extrafailure(42, Out).
Out = 42 ;
false.
The following has more than one solution, so it is not deterministic:
phase_step_twosolutions(X, X).
phase_step_twosolutions(X, Y) :-
plus(X, 1, Y).
?- phase_step_twosolutions(42, Out).
Out = 42 ;
Out = 43.
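Instead of watching the toplevel prompt, the same distinction can be checked programmatically. The following is a minimal sketch, assuming SWI-Prolog, where the cleanup goal of call_cleanup/2 runs as soon as the goal has no choice points left. The name det_status/2 is a hypothetical helper, not a built-in.

```prolog
% The three variants from above, repeated so the sketch is self-contained.
phase_step_det(X, X).

phase_step_extrafailure(X, X).
phase_step_extrafailure(_X, _Y) :-
    false.

phase_step_twosolutions(X, X).
phase_step_twosolutions(X, Y) :-
    plus(X, 1, Y).

% det_status(:Goal, -Status): hypothetical helper (not a built-in).
% Status is det if the first solution of Goal left no choice point,
% and nondet otherwise. Fails if Goal fails.
det_status(Goal, Status) :-
    once(first_solution_status(Goal, Status)).

first_solution_status(Goal, Status) :-
    % Det is bound only if Goal exits without leaving a choice point.
    call_cleanup(Goal, Det = true),
    (   Det == true
    ->  Status = det
    ;   Status = nondet
    ).
```

With these definitions, a query such as ?- det_status(phase_step_extrafailure(42, _), S). should report nondet, while the phase_step_det variant should report det.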
Why is TCO dependent on phase_step predicate?
If there are further paths to be explored, then there must be data about those paths stored somewhere. That "somewhere" is some sort of stack data structure, and for every future path there must be a frame on the stack. This is why your memory usage grows. And with it, the computation time (the following uses copies of your step_n with my corresponding phase_step variants from above):
?- time(step_n_det(100_000, 42, Out)).
% 400,002 inferences, 0.017 CPU in 0.017 seconds (100% CPU, 24008702 Lips)
Out = 42 ;
% 7 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 260059 Lips)
false.
?- time(step_n_extrafailure(100_000, 42, Out)).
% 400,000 inferences, 4.288 CPU in 4.288 seconds (100% CPU, 93282 Lips)
Out = 42 ;
% 100,005 inferences, 0.007 CPU in 0.007 seconds (100% CPU, 13932371 Lips)
false.
?- time(step_n_twosolutions(100_000, 42, Out)).
% 400,000 inferences, 4.231 CPU in 4.231 seconds (100% CPU, 94546 Lips)
Out = 42 ;
% 4 inferences, 0.007 CPU in 0.007 seconds (100% CPU, 548 Lips)
Out = 43 ;
% 8 inferences, 0.005 CPU in 0.005 seconds (100% CPU, 1612 Lips)
Out = 43 ;
% 4 inferences, 0.008 CPU in 0.008 seconds (100% CPU, 489 Lips)
Out = 44 ;
% 12 inferences, 0.003 CPU in 0.003 seconds (100% CPU, 4396 Lips)
Out = 43 ;
% 4 inferences, 0.009 CPU in 0.009 seconds (100% CPU, 451 Lips)
Out = 44 . % many further solutions
One way to explore this is using the SWI-Prolog debugger, which has a way of showing you alternatives (= choice points = future paths to be explored):
?- trace, step_n_det(5, 42, Out).
Call: (9) step_n_det(5, 42, _1496) ? skip % I typed 's' here.
Exit: (9) step_n_det(5, 42, 42) ? alternatives % I typed 'A' here.
[14] step_n_det(0, 42, 42)
Exit: (9) step_n_det(5, 42, 42) ? no debug % I typed 'n' here.
Out = 42 ;
false.
?- trace, step_n_extrafailure(5, 42, Out).
Call: (9) step_n_extrafailure(5, 42, _1500) ? skip
Exit: (9) step_n_extrafailure(5, 42, 42) ? alternatives
[14] step_n_extrafailure(0, 42, 42)
[14] phase_step_extrafailure(42, 42)
[13] phase_step_extrafailure(42, 42)
[12] phase_step_extrafailure(42, 42)
[11] phase_step_extrafailure(42, 42)
[10] phase_step_extrafailure(42, 42)
Exit: (9) step_n_extrafailure(5, 42, 42) ? no debug
Out = 42 ;
false.
All of those alternatives correspond to extra interpreter frames. If you use SWI-Prolog's visual debugger, it will also show you a graph representation of your stack, including all open choice points (though I've always found that hard to make sense of).
So if you want TCO without growing the stack, your phase step needs to execute deterministically. You can achieve that by making the phase_step predicate itself deterministic, or by putting a cut after the phase_step call inside step_n.
Here are the calls from above with a cut after each phase_step:
?- time(step_n_det(100_000, 42, Out)).
% 400,001 inferences, 0.017 CPU in 0.017 seconds (100% CPU, 24204529 Lips)
Out = 42 ;
% 7 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 737075 Lips)
false.
?- time(step_n_extrafailure(100_000, 42, Out)).
% 400,000 inferences, 0.023 CPU in 0.023 seconds (100% CPU, 17573422 Lips)
Out = 42 ;
% 5 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 220760 Lips)
false.
?- time(step_n_twosolutions(100_000, 42, Out)).
% 400,000 inferences, 0.023 CPU in 0.023 seconds (100% CPU, 17732727 Lips)
Out = 42 ;
% 5 inferences, 0.000 CPU in 0.000 seconds (94% CPU, 219742 Lips)
false.
Do not place cuts blindly, only once you understand where and why you really need them. Note how in the extrafailure case the cut only removes failures, but in the twosolutions case it removes actual solutions.
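As a rough sketch of the cut placement described above: the phase_step/2 below is only a stand-in (an assumption for illustration, equivalent to the extrafailure variant), and step_n_cut/3 is a hypothetical name for the modified step_n/3.

```prolog
% Stand-in phase step (assumption): one solution, but the second clause
% leaves a choice point behind.
phase_step(X, X).
phase_step(_X, _Y) :-
    false.

% step_n_cut/3: hypothetical name for step_n/3 with a cut after the phase step.
step_n_cut(0, I, I).
step_n_cut(N, In, Out) :-
    N > 0,
    N1 is N - 1,
    phase_step(In, T),
    !,                       % commit to the first solution of phase_step/2
    step_n_cut(N1, T, Out).
```

The cut discards the choice point left by phase_step/2 before the recursive call, so no alternatives pile up across iterations. It also discards any further solutions of phase_step/2, which is exactly the caveat mentioned above.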
One helpful tool for understanding code performance issues, notably unwanted non-determinism, is a ports profiler, such as the ones found in ECLiPSe and Logtalk. The Logtalk ports_profiler tool is portable, so we can use it here. We start by wrapping your code (from your gist link):
:- use_module(library(lists), []).
:- object(step).
:- public(step_n/3).
:- use_module(lists, [reverse/2]).
% pattern for the nth digit mth-coefficient
digit_m(N, M, D) :-
divmod(M, N, Q, _), divmod(Q, 4, _, C),
(C = 0, D = 0; C = 1, D = 1; C = 2, D = 0; C = 3, D = -1).
calculate_digit_n(N, In, D) :-
calculate_digit_n_(N, In, D, 1, 0).
calculate_digit_n_(_, [], D, _, Acc) :- D1 is abs(Acc), divmod(D1, 10, _, D).
calculate_digit_n_(N, [I | Is], D, M, Acc) :-
digit_m(N, M, C), P is C*I, M1 is M+1, Acc1 is Acc+P,
calculate_digit_n_(N, Is, D, M1, Acc1).
phase_step(In, Out) :-
length(In, L), L1 is L + 1, phase_step_(In, Out, L1, 1, []).
phase_step_(_, Out, L, L, Acc) :- reverse(Out, Acc).
phase_step_(In, Out, L, N, Acc) :-
N < L, calculate_digit_n(N, In, D), N1 is N + 1,
phase_step_(In, Out, L, N1, [D | Acc]).
step_n(0, I, I).
step_n(N, In, Out) :-
prolog_current_frame(Fr), format('~w ', Fr),
N > 0, N1 is N - 1, phase_step(In, T),
step_n(N1, T, Out).
:- end_object.
%:- step_n(10, [1, 2, 3, 4, 5, 6, 7, 8], X).
And then (using SWI-Prolog as the backend as that is the Prolog system you told us you're using):
$ swilgt
...
?- {ports_profiler(loader)}.
% [ /Users/pmoura/logtalk/tools/ports_profiler/ports_profiler.lgt loaded ]
% [ /Users/pmoura/logtalk/tools/ports_profiler/loader.lgt loaded ]
% (0 warnings)
true.
?- logtalk_load(step, [debug(on), source_data(on)]).
% [ /Users/pmoura/step.pl loaded ]
% (0 warnings)
true.
?- step::step_n(10, [1, 2, 3, 4, 5, 6, 7, 8], X).
340 15578 30816 46054 61292 76530 91768 107006 122244 137482
X = [3, 6, 4, 4, 0, 6, 7, 8] .
?- ports_profiler::data.
------------------------------------------------------------------------------
Entity Predicate Fact Rule Call Exit *Exit Fail Redo Error
------------------------------------------------------------------------------
step calculate_digit_n/3 0 80 80 0 80 0 0 0
step calculate_digit_n_/5 0 720 720 0 720 0 0 0
step digit_m/3 0 640 640 40 600 0 0 0
step phase_step/2 0 10 10 0 10 0 0 0
step phase_step_/5 0 90 90 0 90 0 0 0
step step_n/3 1 10 11 0 11 0 0 0
------------------------------------------------------------------------------
true.
The *Exit column is for non-deterministic exits from the procedure box. The tool documentation at https://logtalk.org/manuals/devtools/ports_profiler.html explains the tool and how to interpret the table results. But it is clear at a glance from the table that both phase_step/2 and step_n/3 are non-deterministic.
Update
Note that tail call optimization (TCO) doesn't mean or require that the predicate is deterministic. In your case, TCO can be applied by a Prolog compiler because the last call in the rule for the step_n/3 predicate is a call to itself. That means that a stack frame can be saved on that specific recursive call. It doesn't mean that no choice-points are created by what precedes the recursive call. Using once/1 (as you mention in the comments) simply discards the choice-point created when phase_step/2 is called, as that predicate itself is non-deterministic. That's what the table shows. The step_n/3 predicate is also non-deterministic, and thus calling it creates a choice-point when the first argument is 0, which happens when you call the predicate with a zero as the first argument or when the proof for the query reaches the base case of this recursive definition.
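To illustrate both points, here is a minimal sketch that wraps the phase step in once/1 and selects the base case with an if-then-else, so that neither leaves a choice-point. The phase_step/2 shown is only a stand-in with two solutions (an assumption for illustration), and step_n_once/3 is a hypothetical name.

```prolog
% Stand-in phase step with two solutions (assumption for illustration).
phase_step(X, X).
phase_step(X, Y) :-
    plus(X, 1, Y).

% step_n_once/3: hypothetical name. The if-then-else picks the base case
% without leaving a choice-point, and once/1 keeps only the first solution
% of phase_step/2. The first argument must be bound to an integer.
step_n_once(N, In, Out) :-
    (   N =:= 0
    ->  Out = In
    ;   N > 0,
        N1 is N - 1,
        once(phase_step(In, T)),
        step_n_once(N1, T, Out)
    ).
```

Whether this is appropriate depends on whether the extra solutions of the real phase_step/2 matter, since once/1 throws them away.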
Related
I have a rule that matches bc. When I encounter that in a string, I don't want to parse that string, otherwise parse anything else.
% Prolog
bc(B, C) --> [B, C], {
B = "b",
C = "c"
}.
not_bc(O) --> [O], % ?! bc(O, C).
% ?- phrase(not_bc(O), "bcdefg").
% false.
% ?- phrase(not_bc(O), "abcdefg").
% O = "a".
% ?- phrase(not_bc(O), "wxcybgz").
% O = "w".
% ?- phrase(not_bc(O), "wxybgz").
% O = "w".
Simplified version of my problem, hopefully solutions are isomorphic.
Similar to this question:
Translation to DCG Semicontext not working - follow on
An alternative:
process_bc(_) --> "bc", !, { fail }.
process_bc(C) --> [C].
This differs from my other solution in accepting:
?- time(phrase(process_bc(C), `b`, _)).
% 8 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 387053 Lips)
C = 98.
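For comparison, the same idea can be written with negation in the grammar body instead of a cut followed by failure. This is only a sketch, assuming SWI-Prolog, where \+ is allowed in DCG bodies and a double-quoted string in a DCG body stands for its character codes. The name not_bc//1 is borrowed from the question.

```prolog
% not_bc//1: name borrowed from the question.
% \+ "bc" succeeds, without consuming input, when the remaining input
% does not start with the codes of "bc". Then one element is consumed.
not_bc(O) --> \+ "bc", [O].
```

Called as ?- phrase(not_bc(O), `abcdefg`, _). it binds O to the character code of a, and it fails on `bcdefg`. Like process_bc//1, it accepts the single-character input `b`.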
In swi-prolog:
process_text(C1) --> [C1, C2], { dif([C1, C2], `bc`) }.
Results:
?- time(phrase(process_text(C), `bca`, _)).
% 11 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 376790 Lips)
false.
?- time(phrase(process_text(C), `bd`, _)).
% 10 inferences, 0.000 CPU in 0.000 seconds (80% CPU, 353819 Lips)
C = 98.
?- time(phrase(process_text(C), `zbcagri4gj40w9tu4tu34ty3ty3478t348t`, _)).
% 10 inferences, 0.000 CPU in 0.000 seconds (80% CPU, 372717 Lips)
C = 122.
A single character, or no characters, are both presumably meant to be failures.
This is nicely efficient, only having to check the first 2 characters.
I am new to this language and am having trouble coming up with a solution to this problem. The program must implement the following cases.
Both variables are instantiated:
pivot( [1,2,3,4,5,6,7], [5,6,7,4,1,2,3] ).
yields a true/yes result.
Only Before is instantiated:
pivot( [1,2,3,4,5,6], R ).
unifies R = [4,5,6,1,2,3] as its one result.
Only After is instantiated:
pivot(L, [1,2]).
unifies L = [2,1] as its one result.
Neither variable is instantiated:
pivot(L, R).
is undefined (since results are generated arbitrarily).
If by pivot, you mean to split the list in 2 and swap the halves, then something like this would work.
First, consider the normal case: if you have an instantiated list, pivoting it is trivial. You just need to:
1. figure out half the length of the list
2. break it up into
   a. a prefix, consisting of that many items, and
   b. a suffix, consisting of whatever is left over
3. concatenate those two lists in reverse order
Once you have that, everything else is just a matter of deciding which variable is bound and using that as the source list.
It is a common Prolog idiom to have a single "public" predicate that invokes a "private" worker predicate that does the actual work.
Given that the problem statement requires that at least one of the two variables in your pivot/2 must be instantiated, we can define our public predicate along these lines:
pivot( Ls , Rs ) :- nonvar(Ls), !, pivot0(Ls,Rs) .
pivot( Ls , Rs ) :- nonvar(Rs), !, pivot0(Rs,Ls) .
If Ls is bound, we invoke the worker, pivot0/2 with the arguments as-is. But if Ls is unbound, and Rs is bound, we invoke it with the arguments reversed. The cuts (!) are there to prevent the predicate from succeeding twice if invoked with both arguments bound (pivot([a,b,c],[a,b,c]).).
Our private helper, pivot0/2 is simple, because it knows that the 1st argument will always be bound:
pivot0( Ls , Rs ) :- % to divide a list in half and exchange the halves...
length(Ls,N0) , % get the length of the source list
N is N0 // 2 , % divide it by 2 using integer division
length(Pfx,N) , % construct an unbound list of the desired length
append(Pfx,Sfx,Ls) , % break the source list up into its two halves
append(Sfx,Pfx,Rs) % put the two halves back together in the desired order
. % Easy!
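Note that the first example in the question ([1,2,3,4,5,6,7] giving [5,6,7,4,1,2,3]) keeps the middle element of an odd-length list in place, whereas pivot0/2 as written yields [4,5,6,7,1,2,3] for that input. If the middle element is meant to stay put, one possible variant is sketched below. The name pivot_mid/2 is hypothetical, and the interpretation of "pivot" is an assumption based on that single example.

```prolog
:- use_module(library(lists)).

% pivot_mid/2: hypothetical name. Ls = Pfx ++ Mid ++ Sfx and
% Rs = Sfx ++ Mid ++ Pfx, where Mid holds the middle element of an
% odd-length list and is empty for an even-length list.
pivot_mid(Ls, Rs) :-
    (   is_list(Ls) -> length(Ls, N)
    ;   is_list(Rs) -> length(Rs, N)
    ),                          % fails if neither argument is a proper list
    H is N // 2,                % length of each half
    M is N mod 2,               % 1 if there is a middle element, else 0
    length(Pfx, H),
    length(Sfx, H),
    length(Mid, M),
    append(Pfx, MidSfx, Ls),    % Ls = Pfx ++ Mid ++ Sfx
    append(Mid, Sfx, MidSfx),
    append(Sfx, MidPfx, Rs),    % Rs = Sfx ++ Mid ++ Pfx
    append(Mid, Pfx, MidPfx).
```

Because the relation is symmetric, the same clause serves both directions: the length is taken from whichever argument is a proper list, and the other argument is built or checked by unification.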
In swi-prolog:
:- use_module(library(dcg/basics)).
pivot_using_dcg3(Lst, LstPivot) :-
list_first(Lst, LstPivot, L1, L2, IsList),
phrase(piv3_up(L1), L1, L2),
% Improve determinism
(IsList = true -> ! ; true).
piv3_up(L), string(Ri), string(M), string(Le) --> piv3(L, Le, M, Ri).
piv3([], [], [], Ri) --> [], remainder(Ri).
piv3([_], [], [H], Ri) --> [H], remainder(Ri).
piv3([_, _|Lst], [H|T], M, Ri) --> [H], piv3(Lst, T, M, Ri).
% From 2 potential lists, rearrange them in order of usefulness
list_first(V1, V2, L1, L2, IsList) :-
( is_list(V1) ->
L1 = V1, L2 = V2,
IsList = true
; L1 = V2, L2 = V1,
(is_list(L1) -> IsList = true ; IsList = false)
).
This is general and deterministic, with good performance:
?- time(pivot_using_dcg3(L, P)).
% 18 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 402441 Lips)
L = P, P = [] ;
% 8 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 238251 Lips)
L = P, P = [_] ;
% 10 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 275073 Lips)
L = [_A,_B],
P = [_B,_A] ;
% 10 inferences, 0.000 CPU in 0.000 seconds (94% CPU, 313391 Lips)
L = [_A,_B,_C],
P = [_C,_B,_A] ;
% 12 inferences, 0.000 CPU in 0.000 seconds (87% CPU, 321940 Lips)
L = [_A,_B,_C,_D],
P = [_C,_D,_A,_B] ;
% 12 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 345752 Lips)
L = [_A,_B,_C,_D,_E],
P = [_D,_E,_C,_A,_B] ;
% 14 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 371589 Lips)
L = [_A,_B,_C,_D,_E,_F],
P = [_D,_E,_F,_A,_B,_C] ;
?- numlist(1, 5000000, P), time(pivot_using_dcg3(L, P)).
% 7,500,018 inferences, 1.109 CPU in 1.098 seconds (101% CPU, 6759831 Lips)
The performance could be improved further, using difference lists for the final left-middle-right append, and cuts (sacrificing generality).
I'm new to Prolog and I want to implement the predicate below.
This is my code:
onlyinteger(List,New):-
flatten(List,Fla),
member(X,Fla),
string(X),
delete(X,Fla,New).
onlyinteger([[5, 'A'], [5, 'B'],[1,'A'],[3,'C'],[7,'D']],X). -- input
What I want:
X = [5,5,1,3,7].
% Base case, reached at the end of the loop
only_integer([], []).
% Add the integer to the output list
only_integer([[Int, _Char]|Tail], [Int|LstInt]) :-
% Loop
only_integer(Tail, LstInt).
Result in swi-prolog:
?- time(only_integer([[5, 'A'], [5, 'B'],[1,'A'],[3,'C'],[7,'D']],X)).
% 6 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 187137 Lips)
X = [5,5,1,3,7].
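If the sublists are not guaranteed to be [Int, Char] pairs, a type-based filter may be closer to what the code in the question was attempting with flatten/2. The following is a minimal sketch, assuming SWI-Prolog's library(lists) and library(apply). The name only_integer_flat/2 is hypothetical.

```prolog
:- use_module(library(apply)).
:- use_module(library(lists)).

% only_integer_flat/2: hypothetical name.
% Flattens the nested list, then keeps only the elements that are integers.
only_integer_flat(List, Ints) :-
    flatten(List, Flat),
    include(integer, Flat, Ints).
```

Unlike only_integer/2 above, this version does not check the shape of the sublists, so it also accepts inputs that the pair-matching version would reject.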
I have several dynamic facts in Prolog and I want to shuffle them (reorder in random order). Is there any way in Prolog how to do this?
:- dynamic max/3.
max(1,2,3).
max(1,5,6).
max(3,4,5).
max(2,2,5).
Possible random order:
max(2,2,5).
max(1,2,3).
max(3,4,5).
max(1,5,6).
As you mention that you're using SWI-Prolog, a possible solution is to use its nth_clause/3 and clause/3 built-in predicates. The idea is to access the predicate using a proxy predicate, ramdom_max/3 in this case. I'm also assuming that you only have facts.
:- use_module(library(lists)).
:- use_module(library(random)).
ramdom_max(A, B, C) :-
predicate_property(max(_,_,_), number_of_clauses(N)),
numlist(1, N, List),
random_permutation(List, Permutation),
member(Index, Permutation),
nth_clause(max(_,_,_), Index, Ref),
clause(max(A,B,C), _, Ref).
Sample call:
?- ramdom_max(A, B, C).
A = 1,
B = 2,
C = 3 ;
A = 3,
B = 4,
C = 5 ;
A = 1,
B = 5,
C = 6 ;
A = B, B = 2,
C = 5.
Each call to the ramdom_max/3 predicate will give you a different clause random order but still enumerating all the clauses on backtracking.
This, however, is a relatively computationally costly solution. As max/3 is a dynamic predicate, the first goals in the body of the ramdom_max/3 clause cannot be optimized to run only once. Let's check the number of inferences:
% autoload the time/1 library predicate:
?- time(true).
% 3 inferences, 0.000 CPU in 0.000 seconds (60% CPU, 333333 Lips)
true.
?- time(ramdom_max(A, B, C)).
% 42 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 913043 Lips)
A = 3,
B = 4,
C = 5 ;
% 6 inferences, 0.000 CPU in 0.000 seconds (69% CPU, 272727 Lips)
A = 1,
B = 2,
C = 3 ;
% 4 inferences, 0.000 CPU in 0.000 seconds (69% CPU, 222222 Lips)
A = 1,
B = 5,
C = 6 ;
% 6 inferences, 0.000 CPU in 0.000 seconds (70% CPU, 250000 Lips)
A = B, B = 2,
C = 5.
It's worth comparing with lurker's suggestion in the comments regarding using findall/3. A possible implementation is:
ramdom_max(A, B, C) :-
findall(max(A,B,C), max(A,B,C), Clauses),
random_permutation(Clauses, Permutation),
member(max(A,B,C), Permutation).
Timed call:
?- time(ramdom_max(A, B, C)).
% 40 inferences, 0.000 CPU in 0.000 seconds (78% CPU, 930233 Lips)
A = 1,
B = 5,
C = 6 ;
% 2 inferences, 0.000 CPU in 0.000 seconds (50% CPU, 200000 Lips)
A = 1,
B = 2,
C = 3 ;
% 2 inferences, 0.000 CPU in 0.000 seconds (45% CPU, 250000 Lips)
A = B, B = 2,
C = 5 ;
% 4 inferences, 0.000 CPU in 0.000 seconds (62% CPU, 250000 Lips)
A = 3,
B = 4,
C = 5.
Performance is about the same in this very limited testing. But it's also a simpler and more portable solution. Knowing a bit more about the problem you want to solve would likely allow better solutions, however.
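If the intent is instead to physically reorder the stored clauses, rather than to enumerate them in a random order through a proxy, one possible sketch is to collect the facts, shuffle them, and assert them again. The name shuffle_max/0 is hypothetical, and the sketch assumes that max/3 holds only facts and that no other code is modifying max/3 at the same time.

```prolog
:- use_module(library(lists)).
:- use_module(library(random)).

:- dynamic max/3.

max(1,2,3).
max(1,5,6).
max(3,4,5).
max(2,2,5).

% shuffle_max/0: hypothetical helper; assumes max/3 contains only facts.
% Collects all facts, removes them, and re-adds them in a random order.
shuffle_max :-
    findall(max(A,B,C), max(A,B,C), Facts),
    random_permutation(Facts, Shuffled),
    retractall(max(_,_,_)),
    forall(member(Fact, Shuffled), assertz(Fact)).
```

After calling shuffle_max, a plain ?- max(A, B, C). enumerates the same facts in the new clause order, without paying the permutation cost on every call.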
I have a predicate that returns a number on every iteration, but now I need to return only the max value from all of the iterations that were done.
find_max(X, Y):-
find_number(X, Y).
So find_number() returns only 1 number and some text alongside it. For example, if I were to run it I would get this output:
X = 1, Y = me;
X = 5, Y = you;
X = 6, Y = he;
And the only output I need to return is the X = 6, Y = he;.
I am using SWI-Prolog.
A more portable alternative to the library(aggregate) posted by Willem, as the library is only available in a few Prolog systems, is:
find_max_alt(Xm, Ym) :-
setof(max(X, Y), find_number(X, Y), Solutions),
reverse(Solutions, [max(Xm, Ym)| _]).
This solution also appears to require a smaller number of inferences. Using the data in the question, we get:
?- time(find_max(Xm, Ym)).
% 40 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 800000 Lips)
Xm = 6,
Ym = he.
Versus:
?- time(find_max_alt(Xm, Ym)).
% 25 inferences, 0.000 CPU in 0.000 seconds (76% CPU, 675676 Lips)
Xm = 6,
Ym = he.
The setof/3 predicate is a standard predicate. The reverse/2 predicate is a common list predicate (and much simpler to define than the predicates in the aggregate library).
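As a self-contained sketch of the setof/3 approach, the following uses the sample data from the question and picks the final element with last/2 from library(lists) instead of reversing the list. The name find_max_last/2 is hypothetical.

```prolog
:- use_module(library(lists)).

% Sample data taken from the question.
find_number(1, me).
find_number(5, you).
find_number(6, he).

% find_max_last/2: hypothetical name. setof/3 returns the max(X, Y) terms
% sorted in the standard order of terms, so the last one has the largest X.
find_max_last(Xm, Ym) :-
    setof(max(X, Y), find_number(X, Y), Solutions),
    last(Solutions, max(Xm, Ym)).
```

Like find_max_alt/2, this fails when find_number/2 has no solutions, because setof/3 fails rather than returning an empty list.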
You can use the aggregate library for that:
:- use_module(library(aggregate)).
find_max(Xm, Ym):-
aggregate(max(X, Y), find_number(X, Y), max(Xm, Ym)).