Converting grammar into Prolog

So I am trying to convert a grammar that defines variable definitions in a programming language. This is my first ever Prolog program, and it's very different from typical languages, so I am confused. The grammar goes as follows:
S -> T S | T
T -> char F semicolon | int F semicolon
F -> id | id G
G -> comma F
So effectively it would return true for things like "char id semicolon" or "int id comma id semicolon char id semicolon".
I am trying to turn this into a prolog program to recognize this grammar. What I have so far is:
type([char|T],T).
type([int|T],T).
def([id|T], T).
comma([comma|T], T).
semi([semicolon|T], T).
vardef(L,S) :-
type(L,S1),
def(S1,S2),
comma(S2,S3),
def(S3,S4),
semi(S4,S).
variable_definition(L) :-
vardef(L,[]).
However, this obviously only recognizes input of the specific form "int/char id comma id semicolon". I don't know how to allow a variable number of "comma id" repetitions before the semicolon, or a whole new variable definition after the first one. Other questions on this site about the same thing typically deal with grammars of a fixed shape like this, not ones that accept a variable amount of input.
EDIT: So the question is two-fold. First, how do I make it recognize two different variable definitions, one right after the other? I assume I have to change the last line to do this, but I am unsure how.
Second, how do I make it recognize a variable number of "id"s separated by commas, so that it recognizes "char id semicolon" as well as "char id comma id semicolon"?

The most natural way to express a grammar like this in Prolog is using Prolog's DCG notation:
S -> T S | T
T -> char F semicolon | int F semicolon
F -> id | id G
G -> comma F
s --> t, s | t.
t --> [char], f, [semicolon] | [int], f, [semicolon].
f --> [id] | [id], g.
g --> [comma], f.
The nice thing about DCG notation is that it mirrors the grammar more directly. You can then use phrase/2 to run it:
| ?- phrase(s, [char, id, semicolon]).
true ? ;
no
To some extent, you can also use this grammar to generate valid phrases:
| ?- phrase(t, S).
S = [char,id,semicolon] ? ;
S = [char,id,comma,id,semicolon] ? ;
S = [char,id,comma,id,comma,id,semicolon] ? ;
...
However...
| ?- phrase(s, S).
Fatal Error: local stack overflow (size: 16384 Kb, reached: 16384 Kb,
environment variable used: LOCALSZ)
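The nonterminal s is defined in such a way that this query doesn't terminate: its first alternative, t, s, recurses into s again before any complete answer can be produced. We can fix this by moving the recursive case later: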
s --> t | t, s.
Then:
| ?- phrase(s, S).
S = [char,id,semicolon] ? ;
S = [char,id,comma,id,semicolon] ? ;
S = [char,id,comma,id,comma,id,semicolon] ? ;
...
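Another way to get a fair enumeration, offered here as a minimal sketch rather than as part of the answer itself, is to fix the length of the list before parsing it. The helper name sentence/1 is made up for this illustration. Because every t consumes at least three tokens, phrase/2 terminates for any list of fixed length, so length/2 drives an iterative deepening over sentence lengths and the int alternatives show up as well:

```prolog
% The grammar, with the non-recursive alternative of s first.
s --> t | t, s.
t --> [char], f, [semicolon] | [int], f, [semicolon].
f --> [id] | [id], g.
g --> [comma], f.

% sentence(?S): hypothetical helper, not part of the original answer.
% length/2 enumerates lists of length 0, 1, 2, ... on backtracking,
% so shorter sentences are produced before longer ones.
sentence(S) :-
    length(S, _),
    phrase(s, S).

% ?- sentence(S).
% S = [char,id,semicolon] ;
% S = [int,id,semicolon] ;
% S = [char,id,comma,id,semicolon] ;
% ...
```

The enumeration as a whole is still infinite, since the language is infinite, but every sentence is reached after finitely many answers.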
You can see how this is implemented in standard notation by listing the Prolog code for the predicate:
| ?- listing(t).
% file: user
t(A, B) :-
( A = [char|C],
f(C, D),
D = [semicolon|B]
; A = [int|E],
f(E, F),
F = [semicolon|B]
).
yes
| ?-
You could write this more succinctly as:
t([char|T], B) :-
f(T, [semicolon|B]).
t([int|T], B) :-
f(T, [semicolon|B]).
This would be called as t(L, []), which gives the same result as phrase(t, L).
If we list the rest of the predicates, you can get a complete solution in the form you are asking for:
| ?- listing.
s(A, B) :-
( t(A, B)
; t(A, C),
s(C, B)
).
t(A, B) :-
( A = [char|C],
f(C, D),
D = [semicolon|B]
; A = [int|E],
f(E, F),
F = [semicolon|B]
).
f(A, B) :-
( A = [id|B]
; A = [id|C],
g(C, B)
).
g([comma|A], B) :-
f(A, B).
Refactoring slightly (making it less verbose):
s(L, S) :-
t(L, S).
s(L, S) :-
t(L, S1),
s(S1, S).
t([char|T], S) :-
f(T, [semicolon|S]).
t([int|T], S) :-
f(T, [semicolon|S]).
f([id|S], S).
f([id|S1], S) :-
g(S1, S).
g([comma|S1], S) :-
f(S1, S).
And from here you can define: variable_definition(D) :- s(D, []).
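To check the result against the examples from the question, the refactored clauses can be loaded together with the wrapper. This is only a usage sketch; the clauses are the ones from the refactoring, and the expected answers in the comments follow from tracing them by hand:

```prolog
% Recognizer for the grammar, written with explicit list arguments.
s(L, S) :- t(L, S).
s(L, S) :- t(L, S1), s(S1, S).

t([char|T], S) :- f(T, [semicolon|S]).
t([int|T], S)  :- f(T, [semicolon|S]).

f([id|S], S).
f([id|S1], S) :- g(S1, S).

g([comma|S1], S) :- f(S1, S).

% The whole input must be consumed, hence the empty remainder.
variable_definition(D) :- s(D, []).

% ?- variable_definition([char, id, semicolon]).
% succeeds
% ?- variable_definition([int, id, comma, id, semicolon, char, id, semicolon]).
% succeeds
% ?- variable_definition([int, id, comma, semicolon]).
% fails
```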

Related

Prolog list of predicates to list of lists

I have a list like: [a([x,y]), b([u,v])] and I want my result as [[x,y], [u,v]].
Here is my code:
p(L, Res) :-
findall(X, (member(a(X), L)), A1), append([A1],[],L1),
findall(Y, (member(b(Y), L)), A2), append(L1,[A2],L2),
append(L2, Res).
This provides a partially good result but if my list is [a([x,y]), c([u,v])], I would like the result to be: [[x,y],[]] and it is [[x,y]].
More examples:
p([b([u,v]), a([x,y]), c([s,t]), d([e,f])], R)
The result I get: [[x,y],[u,v]] (as expected).
p([b([u,v]), z([x,y]), c([s,t]), d([e,f])], R)
The result I get: [[u,v]].
The result I want: [[],[u,v]].
EDIT: Added more examples.
Now that it's clear what the problem statement really is, the solution is easier to see. Your current solution is somewhat overdone and can be simplified. Also, the case where you want a [] element when the term isn't found falls a little outside the paradigm, so it can be handled as a special case. @AnsPiter has the right idea about using =../2, particularly if you need a solution that handles multiple occurrences of a and/or b in the list.
p(L, Res) :-
find_term(a, L, As), % Find the a terms
find_term(b, L, Bs), % Find the b terms
append(As, Bs, Res). % Append the results
find_term(F, L, Terms) :-
Term =.. [F, X],
findall(X, member(Term, L), Ts),
( Ts = [] % No results?
-> Terms = [[]] % yes, then list is single element, []
; Terms = Ts % no, then result is the list of terms
).
Usage:
| ?- p([b([u,v]), z([x,y]), c([s,t]), d([e,f])], R).
R = [[],[u,v]]
yes
| ?- p([b([x,y]), a([u,v])], L).
L = [[u,v],[x,y]]
yes
| ?-
The above solution will handle multiple occurrences of a and b.
If the problem really is restricted to one occurrence of each, then findall/3 and append/3 are way overkill and the predicate can be written:
p(L, [A,B]) :-
( member(a(A), L)
-> true
; A = []
),
( member(b(B), L)
-> true
; B = []
).
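The same idea can be parameterised over the functors of interest. The following is a minimal sketch under the assumption that each functor occurs at most once and takes exactly one argument; p_by_functors/3 and first_arg_or_empty/3 are names invented for this illustration:

```prolog
% p_by_functors(+Functors, +List, -Res): hypothetical generalisation of p/2.
% Res holds, for each functor F in Functors, the argument of the first
% F(_) term in List, or [] if List contains no such term.
p_by_functors(Functors, List, Res) :-
    maplist(first_arg_or_empty(List), Functors, Res).

first_arg_or_empty(List, F, Arg) :-
    Term =.. [F, X],            % build the pattern F(X)
    (   member(Term, List)
    ->  Arg = X
    ;   Arg = []
    ).

% ?- p_by_functors([a,b], [b([u,v]), z([x,y]), c([s,t]), d([e,f])], R).
% R = [[],[u,v]].
```

The order of the result follows the order of the functor list, not the order in which the terms appear in the input.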
Term =.. List : Unifies List with a list whose head is the atom corresponding to the principal functor of Term and whose tail is a list of the arguments of Term.
Example :
| ?- foo(n,n+1,n+2)=..List.
List = [foo,n,n+1,n+2] ?
| ?- Term=..[foo,n,n+1,n+2].
Term = foo(n,n+1,n+2)
Going by your description, each term has a single argument that is a list, so:
p([],[]).
p([X|Xs], Result) :-
X=..[F,Y],
(   \+ F = c          % IF the functor is not c
->  Result = [Y|Res]
;   Result = Res      % ELSE; with Result = [Res] ==> [[x,y],[]]
),
p(Xs,Res).
Test :
| ?- p([a([x,y]), c([u,v])],R).
R = [[x,y]] ?
yes
| ?- p([a([x,y]), b([u,v])],R).
R = [[x,y],[u,v]] ?
yes

Split a list in separate lists

I have to define some more constraints for my list.
I want to split my list into separate lists.
Example:
List=[[1,1],[_,0],[_,0],[_,0],[3,1],[_,0],[9,1],[2,0],[4,0]]
I need three lists, which I get from the main list:
[[_,0],[_,0],[_,0]] and [[_,0]] and [[2,0],[4,0]]
So I always need the group of elements between two terms of the form [X,1].
It would be great if you could give me a tip. I don't want the solution, only a tip on how to solve this.
Jörg
This implementation tries to preserve logical purity without restricting the list items to be [_,_], as @false's answer does.
I can see that imposing the above restriction makes a lot of sense. Still, I would like to lift it and attack the more general problem.
The following is based on if_/3, splitlistIf/3 and the reified predicate marker_truth/2.
marker_truth(M,T) reifies the "marker"-ness of M into the truth value T (true or false).
is_marker([_,1]). % non-reified
marker_truth([_,1],true). % reified: variant #1
marker_truth(Xs,false) :-
dif(Xs,[_,1]).
Easy enough! Let's try splitlistIf/3 and marker_truth/2 together in a query:
?- Ls=[[1,1],[_,0],[_,0],[_,0],[3,1],[_,0],[9,1],[2,0],[4,0]],
splitlistIf(marker_truth,Ls,Pss).
Ls = [[1,1],[_A,0],[_B,0],[_C,0],[3,1],[_D,0],[9,1],[2,0],[4,0]],
Pss = [ [[_A,0],[_B,0],[_C,0]], [[_D,0]], [[2,0],[4,0]]] ? ; % OK
Ls = [[1,1],[_A,0],[_B,0],[_C,0],[3,1],[_D,0],[9,1],[2,0],[4,0]],
Pss = [ [[_A,0],[_B,0],[_C,0]], [[_D,0],[9,1],[2,0],[4,0]]],
prolog:dif([9,1],[_E,1]) ? ; % BAD
%% query aborted (6 other BAD answers omitted)
D'oh!
The second answer shown above is certainly not what we wanted.
Clearly, splitlistIf/3 should have split Ls at that point,
as the goal is_marker([9,1]) succeeds. It didn't. Instead, we got an answer with a frozen dif/2 goal that will never be woken up, because it is waiting for the instantiation of the anonymous variable _E.
Guess who's to blame! The second clause of marker_truth/2:
marker_truth(Xs,false) :- dif(Xs,[_,1]). % BAD
What can we do about it? Use our own inequality predicate that doesn't freeze on a variable which will never be instantiated:
marker_truth(Xs,Truth) :- % variant #2
freeze(Xs, marker_truth__1(Xs,Truth)).
marker_truth__1(Xs,Truth) :-
( Xs = [_|Xs0]
-> freeze(Xs0, marker_truth__2(Xs0,Truth))
; Truth = false
).
marker_truth__2(Xs,Truth) :-
( Xs = [X|Xs0]
-> when((nonvar(X);nonvar(Xs0)), marker_truth__3(X,Xs0,Truth))
; Truth = false
).
marker_truth__3(X,Xs0,Truth) :- % X or Xs0 have become nonvar
( nonvar(X)
-> ( X == 1
-> freeze(Xs0,(Xs0 == [] -> Truth = true ; Truth = false))
; Truth = false
)
; Xs0 == []
-> freeze(X,(X == 1 -> Truth = true ; Truth = false))
; Truth = false
).
All this code, for expressing the safe logical negation of is_marker([_,1])? UGLY!
Let's see if it (at least) helps with the above query (the one that gave so many useless answers)!
?- Ls=[[1,1],[_,0],[_,0],[_,0],[3,1],[_,0],[9,1],[2,0],[4,0]],
splitlistIf(marker_truth,Ls,Pss).
Ls = [[1,1],[_A,0],[_B,0],[_C,0],[3,1],[_D,0],[9,1],[2,0],[4,0]],
Pss = [[ [_A,0],[_B,0],[_C,0]], [[_D,0]], [[2,0],[4,0]]] ? ;
no
It works! Considering the coding effort required, however, it is clear that either a code generation scheme or a variant of dif/2 (which shows the above behaviour) will have to be devised.
Edit 2015-05-25
The above implementation of marker_truth/2 somewhat works, but leaves a lot to be desired. Consider:
?- marker_truth(M,Truth). % most general use
freeze(M, marker_truth__1(M, Truth)).
This answer is not what we would like to get. To see why not, let's look at the answers of a comparable use of integer_truth/2:
?- integer_truth(I,Truth). % most general use
Truth = true, freeze(I, integer(I)) ;
Truth = false, freeze(I, \+integer(I)).
Two answers in the most general case: that's how a reified predicate should behave!
Let's recode marker_truth/2 accordingly:
marker_truth(Xs,Truth) :- subsumes_term([_,1],Xs), !, Truth = true.
marker_truth(Xs,Truth) :- Xs \= [_,1], !, Truth = false.
marker_truth([_,1],true).
marker_truth(Xs ,false) :- nonMarker__1(Xs).
nonMarker__1(T) :- var(T), !, freeze(T,nonMarker__1(T)).
nonMarker__1(T) :- T = [_|Arg], !, nonMarker__2(Arg).
nonMarker__1(_).
nonMarker__2(T) :- var(T), !, freeze(T,nonMarker__2(T)).
nonMarker__2(T) :- T = [_|_], !, dif(T,[1]).
nonMarker__2(_).
Let's re-run above query with the new implementation of marker_truth/2:
?- marker_truth(M,Truth). % most general use
Truth = true, M = [_A,1] ;
Truth = false, freeze(M, nonMarker__1(M)).
It is not clear what you mean by a "group of lists". In your example you start with [1,1] which fits your criterion of [_,1]. So shouldn't there be an empty list in the beginning? Or maybe you meant that it all starts with such a marker?
And what if there are further markers around?
First you need to define the criterion for a marker element, and for both cases: when it applies, and when it does not apply, so that the element is one of those in between.
marker([_,1]).
nonmarker([_,C]) :-
dif(1, C).
Note that with these predicates we imply that every element has to be [_,_]. You did not state it, but it does make sense.
split(Xs, As, Bs, Cs) :-
phrase(three_seqs(As, Bs, Cs), Xs).
marker -->
[E],
{marker(E)}.
three_seqs(As, Bs, Cs) -->
marker,
all_seq(nonmarker, As),
marker,
all_seq(nonmarker, Bs),
marker,
all_seq(nonmarker, Cs).
For a definition of all_seq//2 see this
In place of marker, one could write all_seq(marker,[_])
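The linked definition is not reproduced here. A common way to write such a nonterminal, given as an assumption rather than as the exact code the answer refers to, is sketched below together with the clauses it needs to run. It assumes a system with dif/2 (for example SWI-Prolog or SICStus):

```prolog
marker([_,1]).
nonmarker([_,C]) :- dif(1, C).

marker --> [E], { marker(E) }.

% all_seq(P_1, Xs)//0: assumed definition, not taken from the answer.
% Xs is the sequence of consumed elements, each of which satisfies P_1.
all_seq(_, []) --> [].
all_seq(P_1, [X|Xs]) --> [X], { call(P_1, X) }, all_seq(P_1, Xs).

three_seqs(As, Bs, Cs) -->
    marker, all_seq(nonmarker, As),
    marker, all_seq(nonmarker, Bs),
    marker, all_seq(nonmarker, Cs).

split(Xs, As, Bs, Cs) :-
    phrase(three_seqs(As, Bs, Cs), Xs).

% ?- split([[1,1],[_,0],[_,0],[_,0],[3,1],[_,0],[9,1],[2,0],[4,0]], As, Bs, Cs).
% As = [[_,0],[_,0],[_,0]], Bs = [[_,0]], Cs = [[2,0],[4,0]].
```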
You can use a predicate like append/3. For example, to split a list on the first occurrence of the atom x in it, you would say:
?- L = [a,b,c,d,x,e,f,g,x,h,i,j], once(append(Before, [x|After], L)).
L = [a, b, c, d, x, e, f, g, x|...],
Before = [a, b, c, d],
After = [e, f, g, x, h, i, j].
As @false has pointed out, adding an extra requirement might change your result, but this is what is nice about using append/3:
"Split the list on x so that the second part starts with h":
?- L = [a,b,c,d,x,e,f,g,x,h,i,j], After = [h|_], append(Before, [x|After], L).
L = [a, b, c, d, x, e, f, g, x|...],
After = [h, i, j],
Before = [a, b, c, d, x, e, f, g].
This is just the tip.

Coroutining in Prolog: when argument is a list (it has fixed length)

Question
Is it possible to schedule a goal to be executed as soon as the length of a list is known / fixed or, as @false pointed out in the comments, a given argument becomes a [proper] list? Something along this line:
when(fixed_length(L), ... some goal ...).
When-conditions can be constructed using ?=/2, nonvar/1, ground/1, ,/2, and ;/2 only and it seems they are not very useful when looking at the whole list.
As a further detail, I'm looking for a solution that preserves logical purity if that is possible.
Motivation
I think this condition might be useful when one wants to use a predicate p(L) to check a property for a list L, but without using it in a generative way.
E.g. it might be the case that [for efficiency or termination reasons] one prefers to execute the following conjunction p1(L), p2(L) in this order if L has a fixed length (i.e. L is a list), and in reversed order p2(L), p1(L) otherwise (if L is a partial list).
This might be achieved like this:
when(fixed_length(L), p1(L)), p2(L).
Update
I did implement a solution, but it lacks purity.
It would be nice if when/2 would support a condition list/1. In the meantime, consider:
list_ltruth(L, Bool) :-
freeze(L, nvlist_ltruth(L, Bool)).
nvlist_ltruth(Xs0, Bool) :-
( Xs0 == [] -> Bool = true
; Xs0 = [_|Xs1] -> freeze(Xs1, nvlist_ltruth(Xs1, Bool))
; Bool = false
).
when_list(L, Goal_0) :-
list_ltruth(L, Bool), % suspend on an unbound L instead of unifying it
when(nonvar(Bool),( Bool == true, Goal_0 )).
So you can combine this also with other conditions.
Maybe produce a type error, if L is not a list.
when(nonvar(Bool), ( Bool == true -> Goal_0 ; sort([], L) )).
Above trick will only work in an ISO conforming Prolog system like SICStus or GNU that produces a type_error(list,[a|nonlist]) for sort([],[a|nonlist]), otherwise replace it by:
when(nonvar(Bool),
( Bool == true -> Goal_0 ; throw(error(type_error(list,L), _)) )).
Many systems contain some implementation specific built-in like '$skip_list' to traverse lists rapidly, you might want to use it here.
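A usage sketch, assuming a system that provides freeze/2 and when/2 (SWI-Prolog is the one assumed here). In this sketch when_list/2 goes through list_ltruth/2, so an unbound L is suspended rather than bound:

```prolog
% list_ltruth(L, Bool): Bool becomes true once L is a proper list,
% false once L is known not to be a list.
list_ltruth(L, Bool) :-
    freeze(L, nvlist_ltruth(L, Bool)).

nvlist_ltruth(Xs0, Bool) :-
    (   Xs0 == []      -> Bool = true
    ;   Xs0 = [_|Xs1]  -> freeze(Xs1, nvlist_ltruth(Xs1, Bool))
    ;   Bool = false
    ).

% when_list(L, Goal_0): run Goal_0 as soon as L is a proper list.
when_list(L, Goal_0) :-
    list_ltruth(L, Bool),
    when(nonvar(Bool), ( Bool == true, Goal_0 )).

% ?- when_list(L, length(L, N)), L = [a,b|T], T = [c].
% The goal stays suspended while the tail T is unbound and
% runs once T = [c] closes the list, giving N = 3.
```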
I've managed to answer my own question, but not with a pure solution.
Some observations
The difficulty encountered in writing a program that schedules some goal for execution when the length of a list is precisely known is the fact that the actual condition might change. Consider this:
when(fixed_length(L), Goal)
The length of the list might change if L is unbound or if the last tail is unbound. Say we have this argument L = [_,_|Tail]. L has a fixed length only if Tail has a fixed length (in other words, L is a list if Tail is a list). So, a condition that checks Tail might be the only thing to do at first. But if Tail becomes [a|Tail2], a new when-condition that tests whether Tail2 is a list is needed.
The solution
1. Getting the when-condition
I've implemented a predicate that relates a partial list with the when-condition that signals when it might become a list (i.e. nonvar(T) where T is the deepest tail).
condition_fixed_length(List, Cond):-
\+ (List = []),
\+ \+ (List = [_|_]),
List = [_|Tail],
condition_fixed_length(Tail, Cond).
condition_fixed_length(List, Cond):-
\+ \+ (List = []),
\+ \+ (List = [_|_]),
Cond = nonvar(List).
2. Recursively when-conditioning
check_on_fixed_length(List, Goal):-
(
condition_fixed_length(List, Condition)
->
when(Condition, check_on_fixed_length(List, Goal))
;
call(Goal)
).
Example queries
Suppose we want to check that all elements of L are a when the size of L is fixed:
?- check_on_fixed_length(L, maplist(=(a), L)).
when(nonvar(L), check_on_fixed_length(L, maplist(=(a), L))).
... and then L = [_,_|Tail]:
?- check_on_fixed_length(L, maplist(=(a), L)), L = [_,_|L1].
L = [_G2887, _G2890|L1],
when(nonvar(L1), check_on_fixed_length([_G2887, _G2890|L1], maplist(=(a), [_G2887, _G2890|L1]))).
?- check_on_fixed_length(L, maplist(=(a), L)), L = [_,_|L1], length(L1, 3).
L = [a, a, a, a, a],
L1 = [a, a, a].
Impurity
condition_fixed_length/2 is the source of impurity, as can be seen from the following queries:
?- L = [X, Y|Tail], condition_fixed_length(L, Cond), L = [a,a].
L = [a, a],
X = Y, Y = a,
Tail = [],
Cond = nonvar([]).
?- L = [X, Y|Tail], L = [a, a], condition_fixed_length(L, Cond).
false.

Check predicate

I've a problem.
I have 5 constants.
C(1).
C(2).
C(3).
C(4).
C(5).
And I've a predicate named "check" that receives two arguments.
Example:
check( [C(1), C(3), C(4), _, C(5)], ListFinal).
And now it should give me
ListFinal = [C(1), C(3), C(4), C(2), C(5)].
How do I do this? How do I check for the blank space and put there the constant I haven't used? It is possible to change the implementation of the constants.
You could try
check( [] , [] ) .
check( [c(X)|Xs] , [c(X)|Rs] ) :- c(X) , check(Xs,Rs) .
You might also look at findall/3.
You should note, however, that your 'constants' aren't constants in Prolog. The way you've written them, they are facts. And the ones you've listed aren't syntactically valid Prolog: the functor of a term must be either a bareword atom, as in c(3), or an atom enclosed in single quotes, as in 'C'(3) (though why anybody would voluntarily choose to do something like that is beyond me).
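As a small illustration of the findall/3 suggestion, the constants can be written as valid facts and the unused ones collected in one call. This is a sketch only; unused/2 is a hypothetical helper name, and it assumes the list of used constants is ground (an unbound element would be bound by memberchk/2):

```prolog
% The five "constants", written as syntactically valid facts.
c(1).
c(2).
c(3).
c(4).
c(5).

% unused(+Used, -Unused): hypothetical helper.
% Unused is the list of all c/1 facts that do not occur in Used.
unused(Used, Unused) :-
    findall(c(X), ( c(X), \+ memberchk(c(X), Used) ), Unused).

% ?- unused([c(1), c(3), c(4), c(5)], U).
% U = [c(2)].
```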
check(L, C) :-
check(L, [], C).
check([], _, []).
check([c(X)|T], A, [c(X)|C]) :-
c(X),
\+ memberchk(c(X), A),
check(T, [c(X)|A], C).
Some tests:
| ?- check([_, c(3), c(4), _, c(5)], ListFinal).
ListFinal = [c(1),c(3),c(4),c(2),c(5)] ? a
ListFinal = [c(2),c(3),c(4),c(1),c(5)]
no
| ?- check([c(1), c(3), c(4), _, c(5)], ListFinal).
ListFinal = [c(1),c(3),c(4),c(2),c(5)] ? a
no
| ?-
Here's a DCG approach:
remap([c(X)|T], A) --> {c(X), \+ memberchk(c(X), A)}, [c(X)], remap(T, [c(X)|A]).
remap([], _) --> [].
check(L, C) :- phrase(remap(L, []), C).
Once the syntax is corrected, check each argument (easy to do with maplist/3):
check(In, Out) :-
exclude(var, In, NoVars),
maplist(check_var(NoVars), In, Out).
check_var(In, X, Y) :-
var(X) -> c(Z), \+ memberchk(c(Z), In), Y = c(Z) ; Y = X.
usage example
1 ?- check([c(1),X,c(3),c(5)],L).
L = [c(1), c(2), c(3), c(5)] ;
L = [c(1), c(4), c(3), c(5)] ;
false.

Prolog: how to do "check(a++b++c++d equals d++a++c++b) -> yes"

Let's define custom operators - let it be ++,equals
:- op(900, yfx, equals).
:- op(800, xfy, ++).
And fact:
check(A equals A).
I am trying to write a predicate, call it check/1, that succeeds in all of the following situations:
check( a ++ b ++ c ++ d equals c ++ d ++ b ++ a ),
check( a ++ b ++ c ++ d equals d ++ a ++ c ++ b),
check( a ++ b ++ c ++ d equals d ++ b ++ c ++ a ),
% and all permutations... of any amount of atoms
check( a ++ ( b ++ c ) equals (c ++ a) ++ b),
% be resistant to any type of parentheses
return
yes
How to implement this in Prolog? (Code snippet, please. Is it possible? Am I missing something?)
Gnu-Prolog is preferred, but SWI-Prolog is acceptable as well.
P.S. Please treat code, as draft "pseudocode", and don't care for small syntax issues.
P.P.S. '++' is just the beginning. I'd like to add more operators. That's why I'm afraid that putting things into a list might not be a good solution.
Additionally
Additionally, it would be nice if queries were possible (this part is not required; if you are able to answer the first part, that is great and enough):
check( a ++ (b ++ X) equals (c ++ Y) ++ b )
One of the possible results (thanks @mat for showing others):
X=c, Y=a
I am mostly looking for a solution to the first part of the question, the "yes/no" checking.
The second part with X,Y would be a nice addition. In it, X and Y should be simple atoms. For the above example, the domains for X,Y are specified: domain(X,[a,b,c]),domain(Y,[a,b,c]).
Your representation is called "defaulty": In order to handle expressions of this form, you need a "default" case, or explicitly check for atom/1 (which is not monotonic) - you cannot use pattern matching directly to handle all cases. As a consequence, consider again your case:
check( a ++ (b ++ X) equals (c ++ Y) ++ b )
You say this should answer X=c, Y=a. However, it could also answer X = (c ++ d), Y = (a ++ d). Should this solution also occur? If not, it would not be monotonic and thus significantly complicate declarative debugging and reasoning about your program. In your case, would it make sense to represent such expressions as lists? For example, [a,b,c,d] equals [c,d,b,a]? You could then simply use the library predicate permutation/2 to check for equality of such "expressions".
It is of course also possible to work with defaulty representations, and for many cases they might be more convenient for users (think of Prolog source code itself with its defaulty notation for goals, or Prolog arithmetic expressions). You can use non-monotonic predicates like var/1 and atom/1, and also term inspection predicates like functor/3 and (=..)/2 to systematically handle all cases, but they usually prevent or at least impede nice declarative solutions that can be used in all directions to test and also generate all cases.
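The list representation suggested above can be sketched in one clause. The name equal_sums/2 is made up for this illustration, and the sketch assumes permutation/2 is available, as it is in SWI-Prolog (library(lists), autoloaded) and as a built-in in GNU Prolog:

```prolog
% equal_sums(?As, ?Bs): hypothetical name, not from the question.
% Two "sums", represented as flat lists of summands, are equal
% if one list is a permutation of the other.
equal_sums(As, Bs) :-
    permutation(As, Bs).

% ?- equal_sums([a,b,c,d], [c,d,b,a]).
% succeeds
% ?- equal_sums([a,b,X], [c,Y,b]).
% X = c, Y = a (traced against SWI-Prolog's select/3-based definition)
```

With lists there is no nesting to undo, so parentheses stop being an issue, and the same clause both checks ground lists and fills in variables.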
This question is rather old, but I'll post my solution anyways. I'm learning prolog in my spare time, and found this quite a challenging problem.
I learned a lot about DCG and difference lists. I'm afraid, I didn't come up with a solution that does not use lists. Like mat suggested, it transforms terms into flat lists to cope with the parentheses, and uses permutation/2 to match the lists, accounting for the commutative nature of the ++ operator...
Here's what I came up with:
:- op(900, yfx, equ).
:- op(800, xfy, ++).
check(A equ B) :- A equ B.
equ(A,B) :- sum_pp(A,AL,Len), sum_pp(B,BL,Len), !, permutation(AL, BL).
sum_pp(Term, List, Len) :- sum_pp_(Term, List,[], 0,Len).
sum_pp_(A, [A|X],X, N0,N) :- nonvar(A), A\=(_++_), N is N0+1.
sum_pp_(A, [A|X],X, N0,N) :- var(A), N is N0+1.
sum_pp_(A1++A2, L1,L3, N0,N) :-
sum_pp_(A1, L1,L2, N0,N1), (nonvar(N), N1>=N -> !,fail; true),
sum_pp_(A2, L2,L3, N1,N).
The sum_pp/3 predicate is the workhorse which takes a term and transforms it into a flat list of summands, returning (or checking) the length, while being immune to parentheses. It is very similar to a DCG rule (using difference lists), but it is written as a regular predicate because it uses the length to help break the left recursion which would otherwise lead to infinite recursion (took me quite a while to beat it).
It can check ground terms:
?- sum_pp(((a++b)++x++y)++c++d, L, N).
L = [a,b,x,y,c,d],
N = 6 ;
false.
It also generates solutions:
?- sum_pp((b++X++y)++c, L, 5).
X = (_G908++_G909),
L = [b,_G908,_G909,y,c] ;
false.
?- sum_pp((a++X++b)++Y, L, 5).
Y = (_G935++_G936),
L = [a,X,b,_G935,_G936] ;
X = (_G920++_G921),
L = [a,_G920,_G921,b,Y] ;
false.
?- sum_pp(Y, L, N).
L = [Y],
N = 1 ;
Y = (_G827++_G828),
L = [_G827,_G828],
N = 2 ;
Y = (_G827++_G836++_G837),
L = [_G827,_G836,_G837],
N = 3 .
The equ/2 operator "unifies" two terms, and can also provide solutions if there are variables:
?- a++b++c++d equ c++d++b++a.
true ;
false.
?- a++(b++c) equ (c++a)++b.
true ;
false.
?- a++(b++X) equ (c++Y)++b.
X = c,
Y = a ;
false.
?- (a++b)++X equ c++Y.
X = c,
Y = (a++b) ;
X = c,
Y = (b++a) ;
false.
In the equ/2 rule
equ(A,B) :- sum_pp(A,AL,Len), sum_pp(B,BL,Len), !, permutation(AL, BL).
the first call to sum_pp generates a length, while the second call takes the length as a constraint. The cut is necessary, because the first call may continue to generate ever growing lists that will never again match with the second list, leading to infinite recursion. I haven't found a better solution yet...
Thanks for posting such an interesting problem!
/Peter
edit: sum_pp_ written as DCG rules:
sum_pp(Term, List, Len) :- sum_pp_(Term, 0,Len, List, []).
sum_pp_(A, N0,N) --> { nonvar(A), A\=(_++_), N is N0+1 }, [A].
sum_pp_(A, N0,N) --> { var(A), N is N0+1 }, [A].
sum_pp_(A1++A2, N0,N) -->
sum_pp_(A1, N0,N1), { nonvar(N), N1>=N -> !,fail; true },
sum_pp_(A2, N1,N).
update:
sum_pp(Term, List, Len) :-
( ground(Term)
-> sum_pp_chk(Term, 0,Len, List, []), ! % deterministic
; length(List, Len), Len>0,
sum_pp_gen(Term, 0,Len, List, [])
).
sum_pp_chk(A, N0,N) --> { A\=(_++_), N is N0+1 }, [A].
sum_pp_chk(A1++A2, N0,N) --> sum_pp_chk(A1, N0,N1), sum_pp_chk(A2, N1,N).
sum_pp_gen(A, N0,N) --> { nonvar(A), A\=(_++_), N is N0+1 }, [A].
sum_pp_gen(A, N0,N) --> { var(A), N is N0+1 }, [A].
sum_pp_gen(A1++A2, N0,N) -->
{ nonvar(N), N0+2>N -> !,fail; true }, sum_pp_gen(A1, N0,N1),
{ nonvar(N), N1+1>N -> !,fail; true }, sum_pp_gen(A2, N1,N).
I split sum_pp into two variants. The first is a slim version that checks ground terms and is deterministic. The second variant calls length/2 to perform a kind of iterative deepening, which prevents the left recursion from running away before the right recursion gets a chance to produce something. Together with the length checks before each recursive call, this is now safe from infinite recursion in many more cases than before.
In particular the following queries now work:
?- sum_pp(Y, L, N).
L = [Y],
N = 1 ;
Y = (_G1515++_G1518),
L = [_G1515,_G1518],
N = 2 .
?- sum_pp(Y, [a,b,c], N).
Y = (a++b++c),
N = 3 ;
Y = ((a++b)++c),
N = 3 ;
false.
