Given a set and a pairwise xor set, find the second set? - algorithm

We take 2 sets of integers to generate a third set with contains the xor of every element in the first set with every element in the second set.
Now as a problem we have been given the first set and the third set i.e. the set with xors, and we need to generate the second set.
It is guaranteed that there is only one possible answer for the inputs
For Example:
(using binary here for clarity)
Inputs:
Set1: {101, 111}
Set3: {001, 011}
then Set2, the solution set will be
Set2: {110, 100}
as, if we do Set1 ^ Set2
{011, 001, 001, 011}
Important Points:
the inputs are sets, not arrays, so no repetitions
that doesn't mean that there weren't repetitions when set3 was created, a^d may be equal to b^c
there is no size constraint that set1 and set2 have to be of same size.
Also, my test case isn't that great as in this it looks like we can simply do set1^set3 to get the answer, but that is clearly not the correct way.

if you have set1 = (a, b) and set2 = (c, d)
Then you are provided with set3 = (a^c, a^d, b^c, b^d) and need to to find set2.
So we know that first element in set 3 is a^c, so to find c, we will do (a^c)^a => c
Then to find d, we will pick b^d from set3 and (b^d)^b(from set1) and we will get d.

Related

Need efficient algorithm in combinatorics

I am trying to find the best (realistic) algorithm for solving a cryptography challenge, in which:
the given cipher text C is made of about 6000 characters taken in the set S={A,B,C,...,Y,a,b,c,...y}. So |S| = 50.
the encryption scheme does not allow to have two identical adjacent characters in C
25 letters in S are called Nulls, and are unknown
these Nulls must be removed from C to obtain the actual cipher text C' which can then be attacked.
the list of Nulls in C is named N and |N| is close to |C|/2 = 3000
so: |N| + |C'| = |C|
My aim is to identify the 25 Nulls, satisfying these two conditions:
there may not be two identical adjacent characters in C'
there may not be two identical adjacent Nulls in N
Obviously by brute force there are 50!/(25! 25!) = 126410606437752 combinations of 25 Nulls in S, so this is not a realistic approach.
I have tried to recursively explore the tree of sets of Nulls and 'cut branches' as much and as soon as possible.
For example, when adding a letter of S to the subset of Nulls, if the sequence "x n1n2 x" appears in C where x is not yet a Null and n1n2 are Nulls, then x should be a Null too.
However this is not enough for a run-time lower than a few centuries...
Can you think of a more clever algorithm for identifying these 25 Nulls ?
Note: there might be more than one set of Nulls satisfying the two conditions
lets try something like this:
Create a list of sets - each set contains one char from S. the set is the null chars.
while you have more then two sets:
for each set
search the cipher text for X[<set-chars>]+X
if found, union the set with the set X in it.
if no sets where united, start recursing with two sets united.
You can speed up things if you keep a different cipher text for each set, removing from it the chars in the set. if you do so, the search is easier - you are searching for XX, witch is constant length. every time you union two sets you need to remove all the chars in the sets from the cipher text.
The time this well take depends on the string C you are given.
An explanation about the sets - each set is an option for C' or N. If you find that A and X are in the same group, then {A, X} is either a subset of N or of C'. If later you will find the same about Y and B, then {Y, B} is a subset. Later, finding a substring YAXAXY means that Y is in the same group as A and X, and so will B, because it's with Y. At the end you will end with two groups - one for C' and one for N, witch you can't distinguish between.
elyashiv's method is the good one.
It is very fast.
I have produced the two sets C' and N, which are equivalent.
The sub-sets of S, S1 and S2 which produce C' and N are adequately such that S = S1 U S2.
Thank you.

Find if any set is covered by member sets

[Please let me know if this maps to a known problem]
I have n sets of varying sizes. Each element in a set is unique. And each element can occur atmost in two different sets.
I want to perform an operation on these sets but avoid duplicates or missing any element.
Problem: Find out which all of these n sets should be removed because they are covered by other sets.
E.g. [a,b,c]; [a]; [b]. Remove [a], [b] since both are covered by the first one.
E.g. [a,b,c]; [a]; [b]; [c,d]. Remove [a,b,c] since all three elements are covered by remaining sets.
Note: here [a],[b] alone is not valid answer since 'c' is being duplicated. Similarly [a],[b],[c,d] is not valid answer since 'd' will be missed if removed.
I think that this is the Exact Cover problem. The last constraint—that each element is in at most two sets—doesn't seem to me to fundamentally change the problem (although I could easily be wrong about this). The Wikipedia web page contains a good summary of various algorithmic approaches. The algorithm of choice seems to be Dancing Links.
I think this is a case of a 2-sat problem that can be solved in linear time using a method based on Tarjan's algorithm.
Make a variable Ai for each set i. Ai is true if and only if set i is to be included.
For each element that appears in a single set add a clause that Ai=1
For each element that appears in 2 sets i and j, add clauses (Ai && ~Aj) || (~Ai && Aj). These clauses meant that exactly one of Ai and Aj must appear.
You can now solve this using a standard 2-sat algorithm to find whether this is impossible to achieve or a satisfying assignment if it is possible.
For a case with V sets and N elements you will have V variables and up to 2N clauses, so Tarjan's algorithm will have complexity O(V+2N).
Since an element in a set can appear in no more than two sets, then there are fairly straightforward connections between sets, which can be shown as a graph, the two examples are shown below. One example uses red lines to represent edges and the other uses black lines to represent edges.
The above shows that the sets can be divided into three groups.
Sets where all elements appear twice. These sets could potentially be removed and/or the sets that contain those elements could be removed.
Sets where one or more elements appear twice. The elements that appear twice could potentially link to sets that could be removed.
Sets where no elements appear twice. These sets can be ignored.
It's not really clear what happens if all of the sets are in either group 1 or group 3. However there seems to be a fairly simple criterion that allows for quickly removing sets, and the psudocode looks like so:
for each set in group2:
for each element that appears twice in that set:
if the other set that contains that element is in group1:
remove the other set
The performance is then linear in the number of elements.
I tried to find which sets to include rather than remove. Something like this?
(1) List of elements and the indexes of sets they are in
(2) Prime the answer list with indexes of sets that have elements that appear only in them
(3) Comb the map from (1) and if an element's set-index is not in the answer list, add to the answer the index of the smallest set that element is in.
Haskell code:
import Data.List (nub, minimumBy, intersect)
sets = [["a","b","c","e"],["a","b","d"],["a","c","e"]]
lengths = map length sets
--List elements and the indexes of sets they are in
mapped = foldr map [] (nub . concat $ sets) where
map a b = comb (a,[]) sets 0 : b
comb result [] _ = result
comb (a,list) (x:xs) index | elem a x = comb (a,index:list) xs (index + 1)
| otherwise = comb (a,list) xs (index + 1)
--List indexes of sets that have elements that appear only in them
haveUnique = map (head . snd)
. filter (\(element,list) -> null . drop 1 $ list)
$ mapped
--Comb the map and if an element's set-index is not in the answer list,
--add to the answer the index of the smallest set that element is in.
answer = foldr comb haveUnique mapped where
comb (a,list) b
| not . null . intersect list $ b = b
| otherwise =
minimumBy (\setIndexA setIndexB ->
compare (lengths!!setIndexA) (lengths!!setIndexB)) list : b
OUTPUT:
*Main> sets
[["a","b","c","e"],["a","b","d"],["a","c","e"]]
*Main> mapped
[("a",[2,1,0]),("b",[1,0]),("c",[2,0]),("e",[2,0]),("d",[1])]
*Main> haveUnique
[1]
*Main> answer
[2,1]

Algorithm to separate items of the same type

I have a list of elements, each one identified with a type, I need to reorder the list to maximize the minimum distance between elements of the same type.
The set is small (10 to 30 items), so performance is not really important.
There's no limit about the quantity of items per type or quantity of types, the data can be considered random.
For example, if I have a list of:
5 items of A
3 items of B
2 items of C
2 items of D
1 item of E
1 item of F
I would like to produce something like:
A, B, C, A, D, F, B, A, E, C, A, D, B, A
A has at least 2 items between occurences
B has at least 4 items between occurences
C has 6 items between occurences
D has 6 items between occurences
Is there an algorithm to achieve this?
-Update-
After exchanging some comments, I came to a definition of a secondary goal:
main goal: maximize the minimum distance between elements of the same type, considering only the type(s) with less distance.
secondary goal: maximize the minimum distance between elements on every type. IE: if a combination increases the minimum distance of a certain type without decreasing other, then choose it.
-Update 2-
About the answers.
There were a lot of useful answers, although none is a solution for both goals, specially the second one which is tricky.
Some thoughts about the answers:
PengOne: Sounds good, although it doesn't provide a concrete implementation, and not always leads to the best result according to the second goal.
Evgeny Kluev: Provides a concrete implementation to the main goal, but it doesn't lead to the best result according to the secondary goal.
tobias_k: I liked the random approach, it doesn't always lead to the best result, but it's a good approximation and cost effective.
I tried a combination of Evgeny Kluev, backtracking, and tobias_k formula, but it needed too much time to get the result.
Finally, at least for my problem, I considered tobias_k to be the most adequate algorithm, for its simplicity and good results in a timely fashion. Probably, it could be improved using Simulated annealing.
First, you don't have a well-defined optimization problem yet. If you want to maximized the minimum distance between two items of the same type, that's well defined. If you want to maximize the minimum distance between two A's and between two B's and ... and between two Z's, then that's not well defined. How would you compare two solutions:
A's are at least 4 apart, B's at least 4 apart, and C's at least 2 apart
A's at least 3 apart, B's at least 3 apart, and C's at least 4 apart
You need a well-defined measure of "good" (or, more accurately, "better"). I'll assume for now that the measure is: maximize the minimum distance between any two of the same item.
Here's an algorithm that achieves a minimum distance of ceiling(N/n(A)) where N is the total number of items and n(A) is the number of items of instance A, assuming that A is the most numerous.
Order the item types A1, A2, ... , Ak where n(Ai) >= n(A{i+1}).
Initialize the list L to be empty.
For j from k to 1, distribute items of type Ak as uniformly as possible in L.
Example: Given the distribution in the question, the algorithm produces:
F
E, F
D, E, D, F
D, C, E, D, C, F
B, D, C, E, B, D, C, F, B
A, B, D, A, C, E, A, B, D, A, C, F, A, B
This sounded like an interesting problem, so I just gave it a try. Here's my super-simplistic randomized approach, done in Python:
def optimize(items, quality_function, stop=1000):
no_improvement = 0
best = 0
while no_improvement < stop:
i = random.randint(0, len(items)-1)
j = random.randint(0, len(items)-1)
copy = items[::]
copy[i], copy[j] = copy[j], copy[i]
q = quality_function(copy)
if q > best:
items, best = copy, q
no_improvement = 0
else:
no_improvement += 1
return items
As already discussed in the comments, the really tricky part is the quality function, passed as a parameter to the optimizer. After some trying I came up with one that almost always yields optimal results. Thank to pmoleri, for pointing out how to make this a whole lot more efficient.
def quality_maxmindist(items):
s = 0
for item in set(items):
indcs = [i for i in range(len(items)) if items[i] == item]
if len(indcs) > 1:
s += sum(1./(indcs[i+1] - indcs[i]) for i in range(len(indcs)-1))
return 1./s
And here some random result:
>>> print optimize(items, quality_maxmindist)
['A', 'B', 'C', 'A', 'D', 'E', 'A', 'B', 'F', 'C', 'A', 'D', 'B', 'A']
Note that, passing another quality function, the same optimizer could be used for different list-rearrangement tasks, e.g. as a (rather silly) randomized sorter.
Here is an algorithm that only maximizes the minimum distance between elements of the same type and does nothing beyond that. The following list is used as an example:
AAAAA BBBBB CCCC DDDD EEEE FFF GG
Sort element sets by number of elements of each type in descending order. Actually only largest sets (A & B) should be placed to the head of the list as well as those element sets that have one element less (C & D & E). Other sets may be unsorted.
Reserve R last positions in the array for one element from each of the largest sets, divide the remaining array evenly between the S-1 remaining elements of the largest sets. This gives optimal distance: K = (N - R) / (S - 1). Represent target array as a 2D matrix with K columns and L = N / K full rows (and possibly one partial row with N % K elements). For example sets we have R = 2, S = 5, N = 27, K = 6, L = 4.
If matrix has S - 1 full rows, fill first R columns of this matrix with elements of the largest sets (A & B), otherwise sequentially fill all columns, starting from last one.
For our example this gives:
AB....
AB....
AB....
AB....
AB.
If we try to fill the remaining columns with other sets in the same order, there is a problem:
ABCDE.
ABCDE.
ABCDE.
ABCE..
ABD
The last 'E' is only 5 positions apart from the first 'E'.
Sequentially fill all columns, starting from last one.
For our example this gives:
ABFEDC
ABFEDC
ABFEDC
ABGEDC
ABG
Returning to linear array we have:
ABFEDCABFEDCABFEDCABGEDCABG
Here is an attempt to use simulated annealing for this problem (C sources): http://ideone.com/OGkkc.
I believe you could see your problem like a bunch of particles that physically repel eachother. You could iterate to a 'stable' situation.
Basic pseudo-code:
force( x, y ) = 0 if x.type==y.type
1/distance(x,y) otherwise
nextposition( x, force ) = coined?(x) => same
else => x + force
notconverged(row,newrow) = // simplistically
row!=newrow
row=[a,b,a,b,b,b,a,e];
newrow=nextposition(row);
while( notconverged(row,newrow) )
newrow=nextposition(row);
I don't know if it converges, but it's an idea :)
I'm sure there may be a more efficient solution, but here is one possibility for you:
First, note that it is very easy to find an ordering which produces a minimum-distance-between-items-of-same-type of 1. Just use any random ordering, and the MDBIOST will be at least 1, if not more.
So, start off with the assumption that the MDBIOST will be 2. Do a recursive search of the space of possible orderings, based on the assumption that MDBIOST will be 2. There are a number of conditions you can use to prune branches from this search. Terminate the search if you find an ordering which works.
If you found one that works, try again, under the assumption that MDBIOST will be 3. Then 4... and so on, until the search fails.
UPDATE: It would actually be better to start with a high number, because that will constrain the possible choices more. Then gradually reduce the number, until you find an ordering which works.
Here's another approach.
If every item must be kept at least k places from every other item of the same type, then write down items from left to right, keeping track of the number of items left of each type. At each point put down an item with the largest number left that you can legally put down.
This will work for N items if there are no more than ceil(N / k) items of the same type, as it will preserve this property - after putting down k items we have k less items and we have put down at least one of each type that started with at ceil(N / k) items of that type.
Given a clutch of mixed items you could work out the largest k you can support and then lay out the items to solve for this k.

Algorithm/Data Structure for finding combinations of minimum values easily

I have a symmetric matrix like shown in the image attached below.
I've made up the notation A.B which represents the value at grid point (A, B). Furthermore, writing A.B.C gives me the minimum grid point value like so: MIN((A,B), (A,C), (B,C)).
As another example A.B.D gives me MIN((A,B), (A,D), (B,D)).
My goal is to find the minimum values for ALL combinations of letters (not repeating) for one row at a time e.g for this example I need to find min values with respect to row A which are given by the calculations:
A.B = 6
A.C = 8
A.D = 4
A.B.C = MIN(6,8,6) = 6
A.B.D = MIN(6, 4, 4) = 4
A.C.D = MIN(8, 4, 2) = 2
A.B.C.D = MIN(6, 8, 4, 6, 4, 2) = 2
I realize that certain calculations can be reused which becomes increasingly important as the matrix size increases, but the problem is finding the most efficient way to implement this reuse.
Can point me in the right direction to finding an efficient algorithm/data structure I can use for this problem?
You'll want to think about the lattice of subsets of the letters, ordered by inclusion. Essentially, you have a value f(S) given for every subset S of size 2 (that is, every off-diagonal element of the matrix - the diagonal elements don't seem to occur in your problem), and the problem is to find, for each subset T of size greater than two, the minimum f(S) over all S of size 2 contained in T. (And then you're interested only in sets T that contain a certain element "A" - but we'll disregard that for the moment.)
First of all, note that if you have n letters, that this amounts to asking Omega(2^n) questions, roughly one for each subset. (Excluding the zero- and one-element subsets and those that don't include "A" saves you n + 1 sets and a factor of two, respectively, which is allowed for big Omega.) So if you want to store all these answers for even moderately large n, you'll need a lot of memory. If n is large in your applications, it might be best to store some collection of pre-computed data and do some computation whenever you need a particular data point; I haven't thought about what would work best, but for example computing data only for a binary tree contained in the lattice would not necessarily help you anything beyond precomputing nothing at all.
With these things out of the way, let's assume you actually want all the answers computed and stored in memory. You'll want to compute these "layer by layer", that is, starting with the three-element subsets (since the two-element subsets are already given by your matrix), then four-element, then five-element, etc. This way, for a given subset S, when we're computing f(S) we will already have computed all f(T) for T strictly contained in S. There are several ways that you can make use of this, but I think the easiest might be to use two such subset S: let t1 and t2 be two different elements of T that you may select however you like; let S be the subset of T that you get when you remove t1 and t2. Write S1 for S plus t1 and write S2 for S plus t2. Now every pair of letters contained in T is either fully contained in S1, or it is fully contained in S2, or it is {t1, t2}. Look up f(S1) and f(S2) in your previously computed values, then look up f({t1, t2}) directly in the matrix, and store f(T) = the minimum of these 3 numbers.
If you never select "A" for t1 or t2, then indeed you can compute everything you're interested in while not computing f for any sets T that don't contain "A". (This is possible because the steps outlined above are only interesting whenever T contains at least three elements.) Good! This leaves just one question - how to store the computed values f(T). What I would do is use a 2^(n-1)-sized array; represent each subset-of-your-alphabet-that-includes-"A" by the (n-1) bit number where the ith bit is 1 whenever the (i+1)th letter is in that set (so 0010110, which has bits 2, 4, and 5 set, represents the subset {"A", "C", "D", "F"} out of the alphabet "A" .. "H" - note I'm counting bits starting at 0 from the right, and letters starting at "A" = 0). This way, you can actually iterate through the sets in numerical order and don't need to think about how to iterate through all k-element subsets of an n-element set. (You do need to include a special case for when the set under consideration has 0 or 1 element, in which case you'll want to do nothing, or 2 elements, in which case you just copy the value from the matrix.)
Well, it looks simple to me, but perhaps I misunderstand the problem. I would do it like this:
let P be a pattern string in your notation X1.X2. ... .Xn, where Xi is a column in your matrix
first compute the array CS = [ (X1, X2), (X1, X3), ... (X1, Xn) ], which contains all combinations of X1 with every other element in the pattern; CS has n-1 elements, and you can easily build it in O(n)
now you must compute min (CS), i.e. finding the minimum value of the matrix elements corresponding to the combinations in CS; again you can easily find the minimum value in O(n)
done.
Note: since your matrix is symmetric, given P you just need to compute CS by combining the first element of P with all other elements: (X1, Xi) is equal to (Xi, X1)
If your matrix is very large, and you want to do some optimization, you may consider prefixes of P: let me explain with an example
when you have solved the problem for P = X1.X2.X3, store the result in an associative map, where X1.X2.X3 is the key
later on, when you solve a problem P' = X1.X2.X3.X7.X9.X10.X11 you search for the longest prefix of P' in your map: you can do this by starting with P' and removing one component (Xi) at a time from the end until you find a match in your map or you end up with an empty string
if you find a prefix of P' in you map then you already know the solution for that problem, so you just have to find the solution for the problem resulting from combining the first element of the prefix with the suffix, and then compare the two results: in our example the prefix is X1.X2.X3, and so you just have to solve the problem for
X1.X7.X9.X10.X11, and then compare the two values and choose the min (don't forget to update your map with the new pattern P')
if you don't find any prefix, then you must solve the entire problem for P' (and again don't forget to update the map with the result, so that you can reuse it in the future)
This technique is essentially a form of memoization.

Generating Balls in Boxes

Given two sorted vectors a and b, find all vectors which are sums of a and some permutation of b, and which are unique once sorted.
You can create one of the sought vectors in the following way:
Take vector a and a permutation of vector b.
Sum them together so c[i]=a[i]+b[i].
Sort c.
I'm interested in finding the set of b-permutations that yield the entire set of unique c vectors.
Example 0: a='ccdd' and b='xxyy'
Gives the summed vectors: 'cycydxdx', 'cxcxdydy', 'cxcydxdy'.
Notice that the permutations of b: 'xyxy' and 'yxyx' are equal, because in both cases the "box c" and the "box d" both get exactly one 'x' and one 'y'.
I guess this is similar to putting M balls in M boxes (one in each) with some groups of balls and boxes being identical.
Update: Given a string a='aabbbcdddd' and b='xxyyzzttqq' your problem will be 10 balls in 4 boxes. There are 4 distinct boxes of size 2, 3, 1 and 4. The balls are pair wise indistinguishable.
Example 1: Given strings are a='xyy' and b='kkd'.
Possible solution: 'kkd', 'dkk'.
Reason: We see that all unique permutations of b are 'kkd', 'kdk' and 'dkk'. However with our restraints, the two first permutations are considered equal as the indices on which the differ maps to the same char 'y' in string a.
Example 2: Given strings are a='xyy' and b='khd'.
Possible solution: 'khd', 'dkh', 'hkd'.
Example 3: Given strings are a='xxxx' and b='khhd'.
Possible solution: 'khhd'.
I can solve the problem of generating unique candidate b permutations using Narayana Pandita's algorithm as decribed on Wikipedia/Permutation.
The second part seams harder. My best shot is to join the two strings pairwise to a list, sort it and use it as a key in a lookup set. ('xx'+'hd' join→'xh','xd' sort→'xd','xh').
As my M is often very big, and as similarities in the strings are common, I currently generate way more b permutations than actually goes through the set filter. I would love to have an algorithm generating the correct ones directly. Any improvement is welcome.
To generate k-combinations of possibly repeated elements (multiset), the following could be useful: A Gray Code for Combinations of a Multiset (1995).
For a recursive solution you try the following:
Count the number of times each character appears. Say they are x1 x2 ... xm, corresponding to m distinct characters.
Then you need to find all possible ordered pairs (y1 y2 ... ym) such that
0 <= yi <= xi
and Sum yi = k.
Here yi is the number of times character i appears.
The idea is, fix the number of times char 1 appears (y1). Then recursively generate all combinations of k-y1 from the remaining.
psuedocode:
List Generate (int [] x /* array index starting at 1*/,
int k /* size of set */) {
list = List.Empty;
if (Sum(x) < k) return list;
for (int i = 0; i <= x[1], i++) {
// Remove first element and generate subsets of size k-i.
remaining = x.Remove(1);
list_i = Generate(remaining, k-i);
if (list_i.NotEmpty()) {
list = list + list_i;
} else {
return list;
}
}
return list;
}
PRIOR TO EDITS:
If I understood it correctly, you need to look at string a, see the symbols that appear exactly once. Say there are k such symbols. Then you need to generate all possible permutations of b, which contain k elements and map to those symbols at the corresponding positions. The rest you can ignore/fill in as you see fit.
I remember posting C# code for that here: How to find permutation of k in a given length?
I am assuming xxyy will give only 1 unique string and the ones that appear exactly once are the 'distinguishing' points.
Eg in case of a=xyy, b=add
distinguishing point is x
So you select permuations of 'add' of length 1. Those gives you a and d.
Thus add and dad (or dda) are the ones you need.
For a=xyyz b=good
distinguishing points are x and z
So you generate permutations of b of length 2 giving
go
og
oo
od
do
gd
dg
giving you 7 unique permutations.
Does that help? Is my understanding correct?
Ok, I'm sorry I never was able to clearly explain the problem, but here is a solution.
We need two functions combinations and runvector(v). combinations(s,k) generates the unique combinations of a multiset of a length k. For s='xxyy' these would be ['xx','xy','yy']. runvector(v) transforms a multiset represented as a sorted vector into a more simple structure, the runvector. runvector('cddeee')=[1,2,3].
To solve the problem, we will use recursive generators. We run through all the combinations that fits in box1 and the recourse on the rest of the boxes, banning the values we already chose. To accomplish the banning, combinations will maintain a bitarray across of calls.
In python the approach looks like this:
def fillrest(banned,out,rv,b,i):
if i == len(rv):
yield None
return
for comb in combinations(b,rv[i],banned):
out[i] = comb
for rest in fillrest(banned,out,rv,b,i+1):
yield None
def balls(a,b):
rv = runvector(a)
banned = [False for _ in b]
out = [None for _ in rv]
for _ in fill(out,rv,0,b,banned):
yield out[:]
>>> print list(balls('abbccc','xyyzzz'))
[['x', 'yy', 'zzz'],
['x', 'yz', 'yzz'],
['x', 'zz', 'yyz'],
['y', 'xy', 'zzz'],
['y', 'xz', 'yzz'],
['y', 'yz', 'xzz'],
['y', 'zz', 'xyz'],
['z', 'xy', 'yzz'],
['z', 'xz', 'yyz'],
['z', 'yy', 'xzz'],
['z', 'yz', 'xyz'],
['z', 'zz', 'xyy']]
The output are in 'box' format, but can easily be merged back to simple strings: 'xyyzzzz', 'xyzyzz'...

Resources