fit movies on dvd, working but style/code questions - prolog

I made a Prolog program to show me the best way of fitting stuff onto DVDs. The questions are also in the comments of the code, which I'll paste below for reference, but it boils down to this:
Is there a sort of inverted cut operator to make it search for more solutions although it already matches? See fitexact; something like fitexact(Size,Sum,L,L) :- Sum < Size.
What's the best way to keep track of already processed movies? I retract them but wonder how to do it without that.
fitfuzzy uses an if-then construct. I'm not sure what to think of those; it feels weird in Prolog. Trying to make it recursive left me horribly confused, however :)
% given a list of movies and sizes, try to fit them all onto dvd's
% wasting as little space as possible.
% set the dvd size
dvdsize(4812).
% sum of all movies in the db
movies_size(Size) :- findall(S, movie(_,S), LS), sum_list(LS,Size).
% count of all movies in the db
movies_count(Count) :- findall(S, movie(_,S), LS), length(LS,Count).
% see which ones fit exactly
% this is where i got into trouble, the original idea was to make
% it like the fuzzy search below but i don't understand how i can
% express 'when there are no more movies which make an exact fit,
% and the sum is smaller than the dvdsize, the list is ok too'.
fitexact(Movies) :- dvdsize(Size), fitexact(Size, 0, [], Movies).
% Stop when there's a perfect fit
% so here i tried a head with Size and Sum and Sum<Size in the body. That obviously
% doesn't work since it instantly matches.
fitexact(Size, Size, Movies, Movies).
% since otherwise the same movies show up on different dvd's i
% thought it would be best to delete them after they fitted.
% if I don't want to do that, what's the best way to make sure once
% a movie is processed it won't show up again? Should it have an extra
% flag, like movie(name,size,processed), or should i assert
% done dvd's and see if they're already in them? I wonder how long this
% all would take since it's already quite slow as is.
%% :-
%% forall(member(Movie,Movies), retract(movie(Movie,_))). %%, !.
% Otherwise keep filling
fitexact(Size, Sum, Acc, Movies) :-
    movie(Movie, MovieSize),
    \+ member(Movie, Acc),    % no doubles!
    NewSum is Sum + MovieSize,
    NewSum =< Size,
    fitexact(Size, NewSum, [Movie|Acc], Movies).

removedvd(DVD) :-
    forall(member(Movie,DVD), retract(movie(Movie,_))).
% do a fuzzy fit, try exact fits with decreasing size when
% there are no exact fits.
fitfuzzy(DVD) :- dvdsize(Size), fitfuzzy(DVD,Size,0).

fitfuzzy(_,Size,Size) :- movies_size(Size), !.
fitfuzzy(_,Size,Size) :- dvdsize(Size), !.
fitfuzzy(DVD,Size,Wasted) :-
    CheckSize is Size - Wasted,
    % this feels like a horrible way to do this. I very much like suggestions
    % about how to make it recursive or better in general.
    (   fitexact(CheckSize, 0, [], DVD)
    ->  removedvd(DVD)
    ;   NewWasted is Wasted + 1,
        write('fitfuzzy: Increasing wasted space to '), write(NewWasted), nl,
        fitfuzzy(DVD,Size,NewWasted)
    ).
status :-
    movies_count(MoviesLeft),
    movies_size(MoviesSize),
    write('Movies left: '), write(MoviesLeft), nl,
    write('Size left : '), write(MoviesSize), nl.

burnloop :-
    movies_count(C), C > 0,
    fitfuzzy(DVD),
    status,
    write('DVD = '), print(DVD), nl, nl,
    burnloop.

% movies.db contains a list of movie(Name,Size). statements. It also
% must have :- dynamic(movie/2). on top for retract to work.
go :-
    ['movies.db'],
    burnloop.

Just a general comment: Instead of keeping track of processed movies, I find it much more natural to first obtain (for example, via findall/3) the list of movies that still need to be processed, and then simply work this list off. So you have burn_dvd(List0, DVD, List), which takes a list of movies (possibly in combination with their sizes, say as terms of the form movie_size(Name, Size)) as its first argument, constructs a single DVD (by selecting as many movies from List0 as fit on a single DVD, for example after sorting the list by size etc.), and the third argument is the list of remaining movies. You then have a natural extension burn_dvds(List, DVDs) which simply constructs DVDs until no more movies remain:
burn_dvds([], []) :- !.
burn_dvds(Movies0, [DVD|DVDs]) :-
    burn_dvd(Movies0, DVD, Movies),
    burn_dvds(Movies, DVDs).
No assert/1 or retract/1 is necessary for this. Multiple solutions are possible if burn_dvd/3 non-deterministically constructs a single DVD, which is what you may want and which also seems natural.
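A minimal sketch of what burn_dvd/3 could look like, assuming the movie_size(Name, Size) representation suggested above and the question's dvdsize/1 fact; it is deliberately simple and non-deterministic (each movie is either put on the current DVD, if it still fits, or left for a later one), so alternative packings are found on backtracking:

burn_dvd(Movies0, DVD, Movies) :-
    dvdsize(Size),
    fill_dvd(Movies0, Size, DVD, Movies),
    DVD = [_|_].                      % never produce an empty DVD

% fill_dvd(+Candidates, +Room, -DVD, -Remaining)
fill_dvd([], _, [], []).
fill_dvd([movie_size(Name,Size)|Ms], Room, [movie_size(Name,Size)|DVD], Rest) :-
    Size =< Room,                     % take this movie: it still fits
    Room1 is Room - Size,
    fill_dvd(Ms, Room1, DVD, Rest).
fill_dvd([M|Ms], Room, DVD, [M|Rest]) :-
    fill_dvd(Ms, Room, DVD, Rest).    % or leave it for a later DVD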
Using if-then-else is perfectly OK; however, everything that can be expressed by pattern matching should be expressed by pattern matching, since that usually yields more general and also more efficient code.
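For instance (a generic illustration, not code from the question), a predicate written with if-then-else such as

list_length(L, N) :-
    (   L == []
    ->  N = 0
    ;   L = [_|T],
        list_length(T, M),
        N is M + 1
    ).

is usually better expressed through the clause heads:

list_length([], 0).
list_length([_|T], N) :-
    list_length(T, M),
    N is M + 1.

The second version can be used in more modes and lets first-argument indexing do the case distinction.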
format/2 may help you too with output: Instead of:
write('Movies left: '), write(MoviesLeft), nl
you can write:
format("Movies left: ~w\n", [MoviesLeft])
In general, manual output is rarely needed as you can always let the toplevel print solutions for you. In our case, burn_dvds/2 naturally emits the list of DVDs as its answer when you query it.

If you want things to behave as if you kept asking for another solution after each one is provided, but gather them all up into a list, findall is what you want.
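For example, with the question's own fitexact/1 (which yields one candidate DVD per solution), a sketch would be:

?- findall(DVD, fitexact(DVD), AllExactFits).

AllExactFits is then the list of all exactly-fitting DVDs that repeated backtracking would have produced one by one.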
If this is all happening within a single query, you could pass around a list of used movies. For example, burnloop would take as an argument the list of movies used so far; fitfuzzy would take that list and fill in a new version with the movies for that DVD added, and you'd pass that new list to burnloop. Or, since DVD has the new movies in it, write a new predicate to add the movies in DVD to the old list to make the new one.
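A sketch of that idea; fitfuzzy/2 here is hypothetical, an imagined variant of the question's fitfuzzy/1 that skips movies already in Used instead of retracting them:

burnloop(Used) :-
    fitfuzzy(DVD, Used),              % hypothetical: fills one DVD, avoiding Used
    !,
    format("DVD = ~w~n", [DVD]),
    append(DVD, Used, Used1),         % the movies on this DVD are now used as well
    burnloop(Used1).
burnloop(_).

The whole run is then started with burnloop([]).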
What if you let fitexact proceed as it currently does, but also keep the list of movies that came closest to the DVD size, so that instead of failing when it doesn't fill the DVD exactly, it yields that list?

A 'best practice' rule of Prolog says that assert/retract should be avoided except when absolutely required (i.e. when there is no declarative alternative).
Here is a program using select/3 to generate all combinations:
movie(a, 10).
movie(b, 3).
movie(c, 5).
movie(d, 6).
movie(e, 10).
dvdsize(20).
burn(Best) :-
    findall(N-S, movie(N,S), L),
    dvdsize(Max),
    setof(Wasted-Sequence, fitmax(L, Max, Wasted, Sequence), All),
    All = [Best|_],
    maplist(writeln, All).

fitmax(L, AvailableRoom, WastedSpace, [Title|Others]) :-
    select(Title - MovieSize, L, R),
    MovieSize =< AvailableRoom,
    RoomAfterMovie is AvailableRoom - MovieSize,
    fitmax(R, RoomAfterMovie, WastedSpace, Others).
fitmax(_, WastedSpace, WastedSpace, []).
output:
?- burn(X).
0-[a,e]
0-[e,a]
1-[a,b,d]
1-[a,d,b]
1-[b,a,d]
1-[b,d,a]
1-[b,d,e]
1-[b,e,d]
1-[d,a,b]
1-[d,b,a]
1-[d,b,e]
1-[d,e,b]
1-[e,b,d]
1-[e,d,b]
2-[a,b,c]
2-[a,c,b]
2-[b,a,c]
2-[b,c,a]
2-[b,c,e]
2-[b,e,c]
2-[c,a,b]
2-[c,b,a]
2-[c,b,e]
2-[c,e,b]
2-[e,b,c]
2-[e,c,b]
4-[a,d]
4-[d,a]
4-[d,e]
4-[e,d]
5-[a,c]
5-[c,a]
5-[c,e]
5-[e,c]
6-[b,c,d]
6-[b,d,c]
6-[c,b,d]
6-[c,d,b]
6-[d,b,c]
6-[d,c,b]
7-[a,b]
7-[b,a]
7-[b,e]
7-[e,b]
9-[c,d]
9-[d,c]
10-[a]
10-[e]
11-[b,d]
11-[d,b]
12-[b,c]
12-[c,b]
14-[d]
15-[c]
17-[b]
20-[]
X = 0-[a, e].

My previous answer was 'quick and dirty', and it soon shows its limits as the number of movies grows. Here is a better way to find the best fit, and a comparison with the previous answer (reformulated as required by the test).
The key to the optimization is suggested by the knapsack tag, which Axel rightly used when posting the question. I searched the CLP(FD) library for an appropriate way to tackle the problem; here it is:
:- [library(clpfd)].

%% use CLP(FD) to find best fit
%
burn_knapsack(Best, Wasted) :-
    dvdsize(Max),
    findall(Title - Size, movie(Title, Size), Movies),
    knaps(Movies, Max, Best, Wasted).

knaps(Movies, Max, Best, Wasted) :-
    findall([Flag, Title, Size],
            (Flag in 0..1, member(Title - Size, Movies)), AllMovies),
    transpose(AllMovies, [ToBurn, Titles, Sizes]),
    Actual #=< Max,
    scalar_product(Sizes, ToBurn, #=, Actual),
    labeling([max(Actual)], [Actual|ToBurn]),
    findall(Title, (nth1(I, ToBurn, 1),
                    nth1(I, Titles, Title)), Best),
    Wasted is Max - Actual.
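With the small movie/2 database and dvdsize(20) from the earlier answer, the query would be expected to pick the only zero-waste combination, along the lines of:

?- burn_knapsack(Best, Wasted).
Best = [a, e],
Wasted = 0.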
%% compute all combinations of movies that fit on a dvd
%  it's a poor man's clpfd:scalar_product/4
%
burn_naive(Best, Wasted) :-
    dvdsize(Max),
    findall(Title - Size, movie(Title, Size), Movies),
    naive(Movies, Max, Best, Wasted).

naive(Movies, Max, Best, Wasted) :-
    setof(Wasted-Sequence, fitmax(Movies, Max, Wasted, Sequence), [Wasted-Best|_]).

fitmax(L, AvailableRoom, WastedSpace, [Title|Others]) :-
    select(Title - MovieSize, L, R),
    MovieSize =< AvailableRoom,
    RoomAfterMovie is AvailableRoom - MovieSize,
    fitmax(R, RoomAfterMovie, WastedSpace, Others).
fitmax(_, WastedSpace, WastedSpace, []).
%% run test with random generated list
%
%  From, To are num. of movies
%  SzMin, SzMax are min/max+1 of each movie size
%
test_performance(From, To, DvdSize, SzMin, SzMax) :-
    forall(( between(From, To, NumMovies),
             gen_movies(NumMovies, SzMin, SzMax, Movies)
           ),
           ( (   NumMovies < 11
             ->  test(naive, Movies, DvdSize)
             ;   true
             ),
             test(knaps, Movies, DvdSize)
           )).

test(Method, Movies, DvdSize) :-
    time(once(call(Method, Movies, DvdSize, Best, Wasted))),
    writeln((Method, Best, Wasted)).

gen_movies(NumMovies, SzMin, SzMax, Movies) :-
    findall(Title-Size,
            ( between(1, NumMovies, Title),
              random(SzMin, SzMax, Size)
            ), Movies).
I've limited the test for naive to fewer than 11 movies, to avoid a stack overflow:
?- test_performance(8,20, 30, 3,7).
% 93,155 inferences, 0,140 CPU in 0,140 seconds (100% CPU, 665697 Lips)
naive,[1,2,3,5,6],0
% 235,027 inferences, 0,159 CPU in 0,159 seconds (100% CPU, 1481504 Lips)
knaps,[2,3,5,6,8],0
% 521,369 inferences, 0,782 CPU in 0,783 seconds (100% CPU, 666694 Lips)
naive,[1,2,3,4,5,6],0
% 163,858 inferences, 0,130 CPU in 0,131 seconds (100% CPU, 1255878 Lips)
knaps,[3,4,5,6,7,9],0
% 1,607,675 inferences, 2,338 CPU in 2,341 seconds (100% CPU, 687669 Lips)
naive,[1,2,3,4,7,8],0
% 184,056 inferences, 0,179 CPU in 0,180 seconds (100% CPU, 1027411 Lips)
knaps,[5,6,7,8,9,10],0
% 227,510 inferences, 0,156 CPU in 0,156 seconds (100% CPU, 1462548 Lips)
knaps,[5,6,8,9,10,11],0
% 224,621 inferences, 0,155 CPU in 0,155 seconds (100% CPU, 1451470 Lips)
knaps,[6,7,8,9,10,11,12],0
% 227,591 inferences, 0,159 CPU in 0,159 seconds (100% CPU, 1434836 Lips)
knaps,[5,7,9,10,11,12,13],0
% 389,764 inferences, 0,263 CPU in 0,263 seconds (100% CPU, 1482017 Lips)
knaps,[5,8,9,10,12,13,14],0
% 285,944 inferences, 0,197 CPU in 0,198 seconds (100% CPU, 1448888 Lips)
knaps,[8,9,10,12,13,14,15],0
% 312,936 inferences, 0,217 CPU in 0,217 seconds (100% CPU, 1443891 Lips)
knaps,[10,11,12,14,15,16],0
% 343,612 inferences, 0,238 CPU in 0,238 seconds (100% CPU, 1445670 Lips)
knaps,[12,13,14,15,16,17],0
% 403,782 inferences, 0,277 CPU in 0,278 seconds (100% CPU, 1456345 Lips)
knaps,[11,12,13,15,16,17],0
% 433,078 inferences, 0,298 CPU in 0,298 seconds (100% CPU, 1455607 Lips)
knaps,[14,15,16,17,18,19],0
% 473,792 inferences, 0,326 CPU in 0,327 seconds (100% CPU, 1451672 Lips)
knaps,[14,15,16,17,18,19,20],0
true.

Related

What occurs-check optimizations is SWI Prolog using?

To quote the SICStus Prolog manual:
The usual mathematical theory behind Logic Programming forbids the creation of cyclic terms, dictating that an occurs-check should be done each time a variable is unified with a term. Unfortunately, an occurs-check would be so expensive as to render Prolog impractical as a programming language.
However, I ran these benchmarks (The Prolog ones) and saw fairly minor differences (less than 20%) in SWI Prolog between the occurs check (OC) being on and off:
OC is off: :- set_prolog_flag(occurs_check, false). in .swiplrc (restarted)
?- run_interleaved(10).
% 768,486,984 inferences, 91.483 CPU in 91.483 seconds (100% CPU, 8400298 Lips)
true.
?- run(1).
'Program' Time GC
================================
boyer 0.453 0.059
browse 0.395 0.000
chat_parser 0.693 0.000
crypt 0.481 0.000
fast_mu 0.628 0.000
flatten 0.584 0.000
meta_qsort 0.457 0.000
mu 0.523 0.000
nreverse 0.406 0.000
poly_10 0.512 0.000
prover 0.625 0.000
qsort 0.574 0.000
queens_8 0.473 0.000
query 0.494 0.000
reducer 0.595 0.000
sendmore 0.619 0.000
simple_analyzer 0.620 0.000
tak 0.486 0.000
zebra 0.529 0.000
average 0.534 0.003
true.
OC is on: :- set_prolog_flag(occurs_check, true). in .swiplrc (restarted)
?- run_interleaved(10).
% 853,189,814 inferences, 105.545 CPU in 105.580 seconds (100% CPU, 8083669 Lips)
true.
?- run(1).
'Program' Time GC
================================
boyer 0.572 0.060
browse 0.618 0.000
chat_parser 0.753 0.000
crypt 0.480 0.000
fast_mu 0.684 0.000
flatten 0.767 0.000
meta_qsort 0.659 0.000
mu 0.607 0.000
nreverse 0.547 0.000
poly_10 0.541 0.000
prover 0.705 0.000
qsort 0.660 0.000
queens_8 0.491 0.000
query 0.492 0.000
reducer 0.867 0.000
sendmore 0.629 0.000
simple_analyzer 0.757 0.000
tak 0.550 0.000
zebra 0.663 0.000
average 0.634 0.003
true.
Are these benchmarks not representative of real-world usage? (I remember reading somewhere that they were specifically chosen to be "fairly representative") Is SWI Prolog implementing some optimization tricks, that SICStus people aren't aware of, that make the cost of OC minor? If so, are they published? (I found a bunch of papers about this from the '90s, but I don't know if they are state-of-the-art)
The major optimization makes unification of local variables a constant operation.
Many abstract machines like the PLM, ZIP, WAM, and VAM provide a special case for logic variables that cannot be a subterm of some structure (called local variables). Unification with such variables does not require any occurs-check at all and thus can be constant.
In this manner large terms can be "passed back" without an extra occurs-check required.
Without this optimization, the handling of grammars (for parsing a given list) gets an overhead quadratic in the number of tokens. Every time the "input list" is handed back (so, graphically speaking, every time you are crossing a comma after a non-terminal in the grammar body), the remaining input list needs to be scanned for the occurrence of that local variable. (It's better than quadratic in the number of characters, since regular expressions are mostly encoded tail-recursively).
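A small sketch of where this matters; the grammar and its expansion are only illustrative:

% digits --> digit, digits.
% digits --> digit.
% expands roughly to:
digits(S0, S) :- digit(S0, S1), digits(S1, S).
digits(S0, S) :- digit(S0, S).
digit([C|S], S) :- C >= 0'0, C =< 0'9.

When digit/2 succeeds, the remaining input is handed back by unifying it with the caller's fresh variable S1. S1 is a local variable (it is not a subterm of any structure), so that unification needs no occurs-check at all; without this special case, every such hand-back would have to scan the rest of the input list.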
This optimization was introduced in 2007, in SWI-Prolog 5.6.39.
It is surprising that your measurements show overheads even in cases like tak, where not a single structure is built at all. As far as I can remember, occurs-check unification in SWI 5.6.39 ran a tiny bit faster than rational tree unification (for simple cases) as (at that time) no extra setup was needed.
There is still ample room for many further occurs-check optimizations. But those will only happen, if people do use this feature. As for SWI, not much happened in the last 13 years.
But think of it: The very first Prolog, called Prolog 0, did support the occurs-check by default. But from Prolog I ("Marseille Prolog") on, only excuses (such as those you cite) were made. And at least the standard did not rule out occurs-check unification as default, by only defining NSTO executions and requiring unify_with_occurs_check/2 and acyclic_term/1. And now, Prologs like SWI, Tau, and Scryer provide it optionally via a flag.
A further optimization in the same direction is Joachim Beer's NEW_UNBOUND tag, which additionally avoids occurs-checks of some heap variables at the expense of a more complex scheme. See: The Occur-Check Problem Revisited. JLP 5(3), 1988. And LNCS 404.
Here is a test case where the occurs check doubles the time to execute a query. Take this code here, which computes a negation normal form. Since the (=)/2 is at the end of the rule, the number of visited compounds becomes quadratic:
/* Variant 1 */
norm(A, R) :- var(A), !, R = pos(A).
norm((A+B), R) :- !, norm(A, C), norm(B, D), R = or(C,D).
norm((A*B), R) :- !, norm(A, C), norm(B, D), R = and(C,D).
Etc..
We can compare with this variant where the (=)/2 is done first while the compound is not yet instantiated:
/* Variant 2 */
norm(A, R) :- var(A), !, R = pos(A).
norm((A+B), R) :- !, R = or(C,D), norm(A, C), norm(B, D).
norm((A*B), R) :- !, R = and(C,D), norm(A, C), norm(B, D).
Etc..
Here are some measurements for SWI-Prolog 8.3.19. For variant 1, setting the occurs check flag to true doubles the time needed to convert some propositional formulas from the Principia Mathematica:
/* Variant 1 */
/* occurs_check=false */
?- time((between(1,1000,_),test,fail;true)).
% 3,646,000 inferences, 0.469 CPU in 0.468 seconds (100% CPU, 7778133 Lips)
true.
/* occurs_check=true */
?- time((between(1,1000,_),test,fail;true)).
% 6,547,000 inferences, 0.984 CPU in 0.983 seconds (100% CPU, 6650921 Lips)
true.
On the other hand, moving (=)/2 to the front changes the picture favorably:
/* Variant 2 */
/* occurs_check=false */
?- time((between(1,1000,_),test,fail;true)).
% 3,646,000 inferences, 0.453 CPU in 0.456 seconds (99% CPU, 8046345 Lips)
true.
/* occurs_check=true */
?- time((between(1,1000,_),test,fail;true)).
% 6,547,000 inferences, 0.703 CPU in 0.688 seconds (102% CPU, 9311289 Lips)
true.
Open Source (pages no longer available, nor on archive.org):
Negation Normal Form, Non-Tail Recursive
https://gist.github.com/jburse/7705ace654af0df6f4fdd12eee80aaec#file-norm-pl
Negation Normal Form, Tail Recursive
https://gist.github.com/jburse/7705ace654af0df6f4fdd12eee80aaec#file-norm2-pl
193 propositional logic test cases from Principia.
https://gist.github.com/jburse/7705ace654af0df6f4fdd12eee80aaec#file-principia-pl

Naively assuming considered harmful: Prolog predicate with accumulator blows (global) stack, but naive version does not

I have tried a few versions of a simple predicate which pulls random values out of the extra-logical universe and puts them into a list. I assumed that the version with the accumulator would be tail-call optimized, as nothing happens after the recursive call, so a path to optimization exists, but it is not (it uses the "global stack"). On the other hand, the "naive version" has apparently been optimized into a loop. This is SWI-Prolog.
Why would the accumulator version be impervious to tail-call optimization?
Here are the predicate versions.
Slowest, runs out of local stack space (expectedly)
Here, we use a head without function symbols and build the list cell with an explicit unification in the body, to make things explicit.
% Slowest, and uses 4 inferences per call (+ 1 at the end of recursion).
% Uses "local stack" indicated in the "Stack limit (1.0Gb) exceeded"
% error at "Stack depth: 10,321,204":
% "Stack sizes: local: 1.0Gb, global: 7Kb, trail: 1Kb"
oracle_rands_explicit(Out,Size) :-
    Size > 0, !,
    NewSize is Size-1,
    oracle_rands_explicit(R,NewSize),
    X is random_float,
    Out = [X-Size|R].
oracle_rands_explicit([],0).
?- oracle_rands_explicit(X,4).
X = [0.7717053554954681-4, 0.9110187097066331-3, 0.9500246711335888-2, 0.25987829195170065-1].
?- X = 1000000, time(oracle_rands_explicit(_,X)).
% 4,000,001 inferences, 1.430 CPU in 1.459 seconds (98% CPU, 2797573 Lips)
?- X = 50000000, time(oracle_rands_explicit(_,X)).
ERROR: Stack limit (1.0Gb) exceeded
ERROR: Stack sizes: local: 1.0Gb, global: 7Kb, trail: 1Kb
ERROR: Stack depth: 10,321,204, last-call: 0%, Choice points: 6
ERROR: Possible non-terminating recursion: ...
Faster, and does not run out of stack space
Again, we just allow a head with no function symbols to make things explicit, but we move the recursive call to the end of the body, which apparently makes a difference!
% Same number of inferences as Slowest, i.e. 4 inferences per call
% (+ 1 at the end of recursion), but at HALF the time.
% Does not run out of stack space! Conclusion: this is tail-call-optimized.
oracle_rands_explicit_last_call(Out,Size) :-
    Size > 0, !,
    NewSize is Size-1,
    X is random_float,
    Out = [X-Size|R],
    oracle_rands_explicit_last_call(R,NewSize).
oracle_rands_explicit_last_call([],0).
?- oracle_rands_explicit_last_call(X,4).
X = [0.6450176209046125-4, 0.5605468429780708-3, 0.597052872950385-2, 0.14440970112076815-1].
?- X = 1000000, time(oracle_rands_explicit_last_call(_,X)).
% 4,000,001 inferences, 0.697 CPU in 0.702 seconds (99% CPU, 5739758 Lips)
?- X = 50000000, time(oracle_rands_explicit_last_call(_,X)).
% 200,000,001 inferences, 32.259 CPU in 32.464 seconds (99% CPU, 6199905 Lips)
Compact, fewer inferences, and does not run out of stack space
Here we allow function symbols in the head for more compact notation. Still the naive (non-accumulator) formulation.
% Only 3 inferences per call (+ 1 at the end of recursion), but approx
% same time as "Faster".
% Does not run out of stack space! Conclusion: this is tail-call-optimized.
oracle_rands_compact([X-Size|R],Size) :-
    Size > 0, !,
    NewSize is Size-1,
    X is random_float,
    oracle_rands_compact(R,NewSize).
oracle_rands_compact([],0).
?- oracle_rands_compact(X,4).
X = [0.815764980826608-4, 0.6516093608470418-3, 0.03206964297092248-2, 0.376168614426895-1].
?- X = 1000000, time(oracle_rands_compact(_,X)).
% 3,000,001 inferences, 0.641 CPU in 0.650 seconds (99% CPU, 4678064 Lips)
?- X = 50000000, time(oracle_rands_compact(_,X)).
% 150,000,001 inferences, 29.526 CPU in 29.709 seconds (99% CPU, 5080312 Lips)
Accumulator-based and unexpectedly runs out of (global) stack space
% Accumulator-based, 3 inferences per call (+ 1 at the end of recursion + 1 at ignition),
% but it is often faster than the compact version.
% Uses "global stack" as indicated in the "Stack limit (1.0Gb) exceeded"
% error at "Stack depth: 12,779,585":
% "Stack sizes: local: 1Kb, global: 0.9Gb, trail: 40.6Mb"
oracle_rands_acc(Out,Size) :- oracle_rands_acc(Size,[],Out).

oracle_rands_acc(Size,ThreadIn,ThreadOut) :-
    Size > 0, !,
    NewSize is Size-1,
    X is random_float,
    oracle_rands_acc(NewSize,[X-Size|ThreadIn],ThreadOut).
oracle_rands_acc(0,ThreadIn,ThreadOut) :-
    reverse(ThreadIn,ThreadOut).
?- oracle_rands_acc(X,4).
X = [0.7768407880604368-4, 0.03425412654687081-3, 0.6392634169514991-2, 0.8340458397587001-1].
?- X = 1000000, time(oracle_rands_acc(_,X)).
% 4,000,004 inferences, 0.798 CPU in 0.810 seconds (99% CPU, 5009599 Lips)
?- X = 50000000, time(oracle_rands_acc(_,X)).
ERROR: Stack limit (1.0Gb) exceeded
ERROR: Stack sizes: local: 1Kb, global: 0.9Gb, trail: 40.6Mb
ERROR: Stack depth: 12,779,585, last-call: 100%, Choice points: 6
ERROR: In:
ERROR: [12,779,585] user:oracle_rands_acc(37220431, [length:12,779,569], _876)
Addendum: Another version of the "compact" version.
Here we move the Size parameter to the first position and do not use the cut. But indexing is a complex matter; differences will probably only become noticeable with many more clauses.
oracle_rands_compact2(Size,[X-Size|R]) :-
    Size > 0,
    NewSize is Size-1,
    X is random_float,
    oracle_rands_compact2(NewSize,R).
oracle_rands_compact2(0,[]).
Trying with L instead of an anonymous variable, and L used after the call:
X = 10000000, time(oracle_rands_compact2(X,L)),L=[].
% 30,000,002 inferences, 6.129 CPU in 6.159 seconds (100% CPU, 4894674 Lips)
X = 10000000, time(oracle_rands_compact(L,X)),L=[].
% 30,000,001 inferences, 5.865 CPU in 5.892 seconds (100% CPU, 5115153 Lips)
Maybe marginally faster. The above numbers vary a bit; one would really have to gather full statistics over a hundred runs or so.
Does re-introducing the cut make it faster (it doesn't seem to make it slower)?
oracle_rands_compact3(Size,[X-Size|R]) :-
    Size > 0, !,
    NewSize is Size-1,
    X is random_float,
    oracle_rands_compact3(NewSize,R).
oracle_rands_compact3(0,[]).
?- X = 10000000, time(oracle_rands_compact3(X,L)),L=[].
% 30,000,001 inferences, 5.026 CPU in 5.061 seconds (99% CPU, 5969441 Lips)
Can't say, really.
It all depends on the top-level shell and the actual interpretation of the _. Try
?- X = 50000000, time(oracle_rands_compact(L,X)),L=[].
instead; it will be more or less as bad as the accumulator version, which has to produce the entire list first only to hand it over to reverse/2. To see this, use:
?- set_prolog_flag(trace_gc, true).
true.
?- X = 50000000, time(oracle_rands_compact(_,X)).
% GC: gained 0+0 in 0.001 sec; used 440+8; free 126,520+129,008
% GC: gained 0+0 in 0.000 sec; used 464+16; free 126,496+129,000
% GC: gained 0+0 in 0.000 sec; used 464+16; free 126,496+129,000
...
?- X = 50000000, time(oracle_rands_compact(L,X)),L=[].
% SHIFT: l:g:t = 0:1:0 ...l+g+t = 131072+262144+131072 (0.000 sec)
% GC: gained 0+0 in 0.002 sec; used 123,024+16; free 135,008+129,000
% SHIFT: l:g:t = 0:1:0 ...l+g+t = 131072+524288+131072 (0.000 sec)
% GC: gained 0+0 in 0.003 sec; used 257,976+24; free 262,200+128,992
% SHIFT: l:g:t = 0:0:1 ...l+g+t = 131072+524288+262144 (0.000 sec)
% SHIFT: l:g:t = 0:1:0 ...l+g+t = 131072+1048576+262144 (0.000 sec)
% GC: gained 0+0 in 0.007 sec; used 520,104+16; free 524,360+260,072
...
While we are at it: your _compact version can be accelerated by exchanging the arguments and removing the cut. Classic first-argument indexing is capable of handling this situation while avoiding any choice point. (SWI has WAM-style first-argument indexing plus a lesser version for multiple arguments, last time I checked.)

Assert Intermediate Result in Prolog

This is the question.
Define a predicate sigma(N,S) such that S = 1+2+...+N, and remember every new intermediate result during the query. For example, after the query sigma(3,S) it will store something like sigma(2,3) and sigma(3,6) in the database, so that we needn't do duplicate and useless work later.
I tried the following method to solve it.
sigmares(1,1).

mysigma(N,A,Sum) :-
    sigmares(N,SN),
    Sum is SN+A,
    !.
mysigma(N1,Acc,Sum) :-
    N is N1-1,
    A is Acc + N1,
    mysigma(N,A,Sum),
    assertz(sigmares(N1,Sum)).    % <<<<<<<<<< This line doesn't work properly.

sigma(N,X) :-
    mysigma(N,0,X).
There is a problem with the assertz line. Since Sum can only be bound once, to the value of the sum from 1 to N, the facts sigmares(2,6) and sigmares(3,6) will be inserted for the query sigma(3,S). Is there any other way to store the new intermediate sigmares facts?
First, it's good coding style to always declare the dynamic predicates that your code uses using the standard dynamic/1 directive. Simply add at the beginning of the file:
:- dynamic(sigmares/2).
An interesting aspect of your definition of the mysigma/3 predicate is that it is not tail-recursive, with the consequence that it requires space linear in its input. But that is what allows it to cache all intermediate results, as you intend. A fixed version of your code would be:
:- dynamic(sigma_cache/2).

sigma_cache(1, 1).

sigma(N, S) :-
    sigma_cache(N, S),
    !.
sigma(N, S) :-
    N > 1,
    M is N - 1,
    sigma(M, SM),
    S is SM + N,
    assertz(sigma_cache(N, S)).
Sample call:
?- sigma(5, S).
S = 15.
?- listing(sigma_cache/2).
:- dynamic sigma_cache/2.
sigma_cache(1, 1).
sigma_cache(2, 3).
sigma_cache(3, 6).
sigma_cache(4, 10).
sigma_cache(5, 15).
true.
This alternative answer provides a solution based on the tabling mechanism found in some Prolog systems, including B-Prolog, Ciao, SWI-Prolog, XSB, and YAP:
:- table(sigma/2).

sigma(1, 1).
sigma(N, S) :-
    N > 1,
    M is N - 1,
    sigma(M, SM),
    S is SM + N.
Let's test it with the help of SWI-Prolog's handy time/1 library predicate, which reports the time and number of inferences taken to prove a goal:
?- time(sigma(5, S)).
% 166 inferences, 0.000 CPU in 0.006 seconds (2% CPU, 1238806 Lips)
S = 15.
?- time(sigma(5, S)).
% 5 inferences, 0.000 CPU in 0.000 seconds (68% CPU, 208333 Lips)
S = 15.
Note that I used a non tail-recursive definition for the sigma/2 predicate on purpose so that all intermediate results are cached (as per the requirements in your question). For example:
?- time(sigma(4, S)).
% 5 inferences, 0.000 CPU in 0.000 seconds (70% CPU, 217391 Lips)
S = 10.
You can see that, after the first call, the result is cached by the tabling mechanism, resulting in a much lower number of inferences when we repeat the query.
?- time(sigma(6, S)).
% 32 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 727273 Lips)
S = 21.
?- time(sigma(6, S)).
% 5 inferences, 0.000 CPU in 0.000 seconds (70% CPU, 217391 Lips)
S = 21.
Note again the number of inferences. The first query reuses the cached result for sigma(5, S) and caches the result for sigma(6, S), making the repeated query again faster as it just reuses the cached result.

STRIPS Planner loops indefinitely

I defined a STRIPS planner in Prolog to solve logic problems. After a few tryouts with other, simpler problems, I set out to see if it could solve a more complex one: I gave it a STRIPS definition of peg solitaire (the English version, considering that we can't do diagonal moves and that the last ball must end up in the center of the board) and tried it, at which point the program got stuck in a loop. Here's the problem: https://en.wikipedia.org/wiki/Peg_solitaire
Here's my solution:
%%%%%%%%%%%%%%%%%%%%%% PLAN %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
accao(nome : move(Xi,Yi,Xf,Yf),
      condicoes : [empty(Xf,Yf), ball(Xi,Yi), ball(Xm,Ym)],
      efeitos : [ball(Xf,Yf), -ball(Xm,Ym), -ball(Xi,Yi), empty(Xi,Yi), empty(Xm,Ym), -empty(Xf,Yf)],
      restricoes : [abs(Xf-Xi)+abs(Yf-Yi)=:=2, abs(Xf-Xi)*abs(Yf-Yi)=:=0, Xi=<Xm, Xm=<Xf, Yi=<Ym, Ym=<Yf]).
inicial([empty(5,5), ball(1,4), ball(1,5), ball(1,6),
ball(2,4), ball(2,5), ball(2,6),
ball(3,4), ball(3,5), ball(3,6),
ball(4,1), ball(4,2), ball(4,3),ball(4,4), ball(4,5), ball(4,6),ball(4,7), ball(4,8), ball(4,9),
ball(5,1), ball(5,2), ball(5,3),ball(5,4), ball(5,6),ball(5,7), ball(5,8), ball(5,9),
ball(6,1), ball(6,2), ball(6,3),ball(6,4), ball(6,5), ball(6,6),ball(6,7), ball(6,8), ball(6,9),
ball(7,4), ball(7,5), ball(7,6),
ball(8,4), ball(8,5), ball(8,6),
ball(9,4), ball(9,5), ball(9,6)]).
objectivos([ball(5,5), empty(1,4), empty(1,5), empty(1,6),
empty(2,4), empty(2,5), empty(2,6),
empty(3,4), empty(3,5), empty(3,6),
empty(4,1), empty(4,2), empty(4,3),empty(4,4), empty(4,5), empty(4,6),empty(4,7), empty(4,8), empty(4,9),
empty(5,1), empty(5,2), empty(5,3),empty(5,4), empty(5,6),empty(5,7), empty(5,8), empty(5,9),
empty(6,1), empty(6,2), empty(6,3),empty(6,4), empty(6,5), empty(6,6),empty(6,7), empty(6,8), empty(6,9),
empty(7,4), empty(7,5), empty(7,6),
empty(8,4), empty(8,5), empty(8,6),
empty(9,4), empty(9,5), empty(9,6)]).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%% PRINT FUNCTION %%%%%%%%%%%%%%%%%%%%%%%%%%%
printExec([]).
printExec([A,E|T]) :-
    write("Action performed: "),
    write(A), nl,
    write("Situation: "),
    write(E), nl,
    printExec(T).

writeExec([I|T]) :-
    write("Initial Situation"),
    write(I), nl,
    printExec(T),
    write("Goal: "),
    objectivos(G),
    write(G),
    write(" satisfied."), nl.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%% AUXILIAR FUNCTIONS %%%%%%%%%%%%%%%%%%%%%%%%%
member(E,[E|_]).
member(E,[_|T]) :- member(E,T).

sub([],_).
sub([H|T],L) :-
    member(H,L),
    sub(T,L).

remove(_,[],[]) :- !.
remove(E1,[E2|T],T) :- E1 == E2, !.
remove(E,[H|T1],[H|T2]) :- remove(E,T1,T2).

add(E,[],[E]) :- !.
add(E1,[E2|T],[E1,E2|T]) :- E1 \== E2, !.
add(E,[H|T1],[H|T2]) :- add(E,T1,T2).

effects([],S,S).
effects([-H|Fx],S,N) :- !,
    remove(H,S,NS),
    effects(Fx,NS,N).
effects([H|Fx],S,N) :- !,
    add(H,S,NS),
    effects(Fx,NS,N).

restriction([]).
restriction([R|T]) :-
    R,
    restriction(T).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%% PLAN EXECUTE %%%%%%%%%%%%%%%%%%%%%%%%%%%
planExecute(P) :- testPlan(P,E), writeExec(E), !.

satisfiedGoal(E) :-
    objectivos(Fn), !,
    sub(Fn,E).

testPlan(Plan,[I|Exec]) :-
    inicial(I),
    testPlan(Plan,I,Exec,Fn),
    satisfiedGoal(Fn).

testPlan([],Fn,[],Fn).
testPlan([H|T],S,[H,N|Exec],Fn) :-
    accao(nome:H, condicoes:C, efeitos:E, restricoes:R),
    sub(C,S),
    effects(E,S,N),
    restriction(R),
    testPlan(T,N,Exec,Fn).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%% FIND PLAN %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
plano(P) :- progressivePlan(P, 0).

progressivePlan(P, N) :- createPlan(P,_,0,N).
progressivePlan(P, N) :-
    \+ createPlan(P,_,0,N),
    NewN is N + 1,
    progressivePlan(P, NewN).

createPlan(Plan,[I|Exec],N,Max) :-
    inicial(I),
    createPlan(Plan,I,Exec,Fn,N,Max),
    satisfiedGoal(Fn).

createPlan([],Fn,[],Fn,Max,Max) :- !.
createPlan([H|T],S,[H,N|Exec],Fn,Acc,Max) :-
    accao(nome:H, condicoes:C, efeitos:E, restricoes:R),
    sub(C,S),
    effects(E,S,N),
    restriction(R),
    NewAcc is Acc+1,
    createPlan(T,N,Exec,Fn,NewAcc,Max).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
I've tried simplifying the goal to just one or two moves. A single move works, and so do two moves as long as they don't interact; but when they do interact (like moving a marble through a position emptied by the other move), the program enters the loop, as with this objective:
objectivos([ball(4,5), empty(3,5), empty(5,5), empty(6,5)]).
I've tried tracing and debugging but I can't seem to find the issue, although I believe it to be located in the formulation of the problem rather than in the planner itself. Any ideas?
There is at least one logical error in your code, and some simple performance tweaks are possible. This gives a partial solution to your problem.
First, for the logical error: The intended solution for the goal objectivos([ball(4,5), empty(3,5), empty(5,5), empty(6,5)]) seems to be the plan P = [move(3, 5, 5, 5), move(6, 5, 4, 5)]. But the second of these moves is not legal with your definition of restricoes: For this move you have Xi = 6, Xf = 4, and conditions requiring that 6 =< Xm and Xm =< 4, but this is impossible. The idea of these constraints is to ensure that ball(Xm,Ym) is between the other two balls in the move. Here is an alternative formulation that ensures this:
restricoes : [abs(Xf-Xi)+abs(Yf-Yi) =:= 2,
              abs(Xf-Xi)*abs(Yf-Yi) =:= 0,
              abs(Xf-Xm)+abs(Yf-Ym) =:= 1,
              abs(Xi-Xm)+abs(Yi-Ym) =:= 1]
This also excludes a case that confused me before, when tracing the code: Previously it was legal to have ball(Xi,Yi) = ball(Xm,Ym).
Second, to improve performance, exchange the goals effects(E,S,N) and restriction(R) in the definition of createPlan/6. Previously you computed the effects of moves before checking their legality! Because most moves proposed by the planner are illegal, this wastes a lot of time.
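Concretely, the recursive clause of createPlan/6 then reads (only the order of the two goals changes):

createPlan([H|T],S,[H,N|Exec],Fn,Acc,Max) :-
    accao(nome:H, condicoes:C, efeitos:E, restricoes:R),
    sub(C,S),
    restriction(R),      % check the move's legality first ...
    effects(E,S,N),      % ... and only then compute its effects
    NewAcc is Acc+1,
    createPlan(T,N,Exec,Fn,NewAcc,Max).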
Then, to make the whole thing nicer to use, you can change the definitions of plano/1 and createPlan/4 to:
plano(P) :-
    length(P, PlanLength),
    createPlan(P, _, 0, PlanLength).

createPlan(Plan,[I|Exec],N,Max) :-
    inicial(I),
    N =< Max,
    createPlan(Plan,I,Exec,Fn,N,Max),
    satisfiedGoal(Fn).
This is simpler than the definition you had before, and it also behaves more nicely. We can pass in a complete plan to check whether it is legal, or just pass in a list of fixed length to ask what plans of that length exist:
?- P = [_,_], plano(P).
P = [move(3, 5, 5, 5), move(6, 5, 4, 5)] ;
false. % no more solutions
With your definition, this would go on looping and counting up the Max counter, searching for further solutions that cannot exist.
With this formulation we can switch to your big goal and try to search for a solution (this is partly specific to SWI-Prolog):
?- length(P, N), format('searching for solutions of length ~w~n', [N]), time(plano(P)).
searching for solutions of length 0
% 58 inferences, 0.000 CPU in 0.000 seconds (71% CPU, 2171959 Lips)
searching for solutions of length 1
% 9,709 inferences, 0.001 CPU in 0.001 seconds (100% CPU, 9123980 Lips)
searching for solutions of length 2
% 79,789 inferences, 0.009 CPU in 0.009 seconds (100% CPU, 8778416 Lips)
searching for solutions of length 3
% 477,230 inferences, 0.051 CPU in 0.051 seconds (100% CPU, 9409315 Lips)
searching for solutions of length 4
% 3,412,088 inferences, 0.361 CPU in 0.361 seconds (100% CPU, 9453315 Lips)
searching for solutions of length 5
% 30,967,699 inferences, 3.503 CPU in 3.503 seconds (100% CPU, 8840598 Lips)
searching for solutions of length 6
I had to interrupt the search at this point; it becomes too slow. More tweaks are certainly possible, and I might keep looking at this.

Implementation of `last/2` in terms of `append/3` or `reverse/2`

Why does the SWI-Prolog documentation suggest append(_, [Last], List) as a portable last/2 instead of, say, reverse(List, [Last|_]) (see here)? Is it that reverse/2 itself is not as widely implemented as append/3 is? Or is there something else I am missing from the picture?
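For reference, the two formulations under discussion, written out as predicates (the names are only for this comparison):

last_by_append(List, Last) :-
    append(_, [Last], List).

last_by_reverse(List, Last) :-
    reverse(List, [Last|_]).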
Either way, all three do not terminate if the list is cyclic:
?- L = [z|L], last(L, Last).
^CAction (h for help) ? abort
% Execution Aborted
?- L = [z|L], append(_, [Last], L).
^CAction (h for help) ? abort
% Execution Aborted
?- L = [z|L], reverse(L, [Last|_]).
^CAction (h for help) ? abort
% Execution Aborted
But still, reverse/2 at least does not leave a choice point behind on proper lists:
?- append(_, [Last], [a]).
Last = a ;
false.
?- reverse([a], [Last|_]).
Last = a.
The definition of reverse/2 is in fact less common, and also SWI's implementation has better termination behavior, whereas many other implementations only terminate if the first argument is a list. I see at least 3 different implementations: SWI on the one hand, SICStus and many others on the other, and then XSB, which is a bit in the middle of both. You can distinguish them with the following goals:
reverse(Xs, [a]). % terminates for SWI and XSB
reverse([a|Xs], [a]). % terminates for SWI
Performance-wise, I would expect that the traditional reverse/2 (not SWI's implementation) should be a bit faster because it runs entirely deterministically. On the other hand, it recreates the entire list on the heap.
In current implementations, append(_, [L], Xs) is not ideally implemented: For each element of the list Xs, a choice point is created and then removed, leaving the last choice point active.
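The 'traditional' definition referred to here is the textbook accumulator version, roughly:

reverse(Xs, Ys) :-
    reverse_(Xs, [], Ys).

reverse_([], Ys, Ys).
reverse_([X|Xs], Acc, Ys) :-
    reverse_(Xs, [X|Acc], Ys).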
For more, see this question.
Actually, the system behaves exactly the opposite of what I was expecting:
13 ?- length(L,1000000),time(reverse(L,[X|_])),statistics.
% 1,000,002 inferences, 0.422 CPU in 0.424 seconds (100% CPU, 2367605 Lips)
% Started at Tue Mar 10 14:31:08 2015
% 523.563 seconds cpu time for 7,770,847 inferences
% 13,661 atoms, 4,540 functors, 3,987 predicates, 81 modules, 214,610 VM-codes
%
% Limit Allocated In use
% Local stack: 268,435,456 12,288 1,904 Bytes
% Global stack: 268,435,456 100,659,184 72,011,904 Bytes
% Trail stack: 268,435,456 129,016 2,280 Bytes
%
% 8 garbage collections gained 1,837,408 bytes in 1.346 seconds.
% Stack shifts: 13 local, 68 global, 47 trail in 0.034 seconds
% 2 threads, 0 finished threads used 0.000 seconds
L = [_G1238, _G1241, _G1244, _G1247, _G1250, _G1253, _G1256, _G1259, _G1262|...].
14 ?- length(L,1000000),time(append(_,[X],L)),statistics.
% 999,999 inferences, 0.572 CPU in 0.574 seconds (100% CPU, 1747727 Lips)
% Started at Tue Mar 10 14:31:08 2015
% 536.544 seconds cpu time for 8,772,339 inferences
% 13,662 atoms, 4,540 functors, 3,987 predicates, 81 modules, 214,615 VM-codes
%
% Limit Allocated In use
% Local stack: 268,435,456 12,288 2,960 Bytes
% Global stack: 268,435,456 50,327,536 48,011,920 Bytes
% Trail stack: 268,435,456 30,712 2,312 Bytes
%
% 8 garbage collections gained 1,837,408 bytes in 1.346 seconds.
% Stack shifts: 13 local, 72 global, 50 trail in 0.036 seconds
% 2 threads, 0 finished threads used 0.000 seconds
L = [_G1240, _G1243, _G1246, _G1249, _G1252, _G1255, _G1258, _G1261, _G1264|...]
.
It seems that reverse/2 uses twice the allocation of append/3. The global and trail stack usage is double for reverse/2, a consequence of reverse/2 being cleverly compiled to reverse/4...
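For reference, the reverse/4 scheme mentioned above follows roughly this pattern (a paraphrase of the idea, not SWI's exact source):

reverse(Xs, Ys) :-
    reverse_(Xs, [], Ys, Ys).

% The fourth argument is seeded with the second argument of reverse/2 and is
% walked in lock-step with the first; this is what gives the better
% termination behaviour when the second argument is a proper list.
reverse_([], Ys, Ys, []).
reverse_([X|Xs], Rs, Ys, [_|Bound]) :-
    reverse_(Xs, [X|Rs], Ys, Bound).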
