What is the best way to find the period of a (repeating) list in Mathematica? - algorithm

What is the best way to find the period in a repeating list?
For example:
a = {4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2}
has repeat {4, 5, 1, 2, 3} with the remainder {4, 5, 1, 2} matching, but being incomplete.
The algorithm should be fast enough to handle longer cases, like so:
b = RandomInteger[10000, {100}];
a = Join[b, b, b, b, Take[b, 27]]
The algorithm should return $Failed if there is no repeating pattern like above.

Please see the comments interspersed with the code on how it works.
(* True if a has period p *)
testPeriod[p_, a_] := Drop[a, p] === Drop[a, -p]
(* are all the list elements the same? *)
homogeneousQ[list_List] := Length#Tally[list] === 1
homogeneousQ[{}] := Throw[$Failed] (* yes, it's ugly to put this here ... *)
(* auxiliary for findPeriodOfFirstElement[] *)
reduce[a_] := Differences#Flatten#Position[a, First[a], {1}]
(* the first element occurs every ?th position ? *)
findPeriodOfFirstElement[a_] := Module[{nl},
nl = NestWhileList[reduce, reduce[a], ! homogeneousQ[#] &];
Fold[Total#Take[#2, #1] &, 1, Reverse[nl]]
]
(* the period must be a multiple of the period of the first element *)
period[a_] := Catch#With[{fp = findPeriodOfFirstElement[a]},
Do[
If[testPeriod[p, a], Return[p]],
{p, fp, Quotient[Length[a], 2], fp}
]
]
Please ask if findPeriodOfFirstElement[] is not clear. I did this independently (for fun!), but now I see that the principle is the same as in Verbeia's solution, except the problem pointed out by Brett is fixed.
I was testing with
b = RandomInteger[100, {1000}];
a = Flatten[{ConstantArray[b, 1000], Take[b, 27]}];
(Note the low integer values: there will be lots of repeating elements within the same period *)
EDIT: According to Leonid's comment below, another 2-3x speedup (~2.4x on my machine) is possible by using a custom position function, compiled specifically for lists of integers:
(* Leonid's reduce[] *)
myPosition = Compile[
{{lst, _Integer, 1}, {val, _Integer}},
Module[{pos = Table[0, {Length[lst]}], i = 1, ctr = 0},
For[i = 1, i <= Length[lst], i++,
If[lst[[i]] == val, pos[[++ctr]] = i]
];
Take[pos, ctr]
],
CompilationTarget -> "C", RuntimeOptions -> "Speed"
]
reduce[a_] := Differences#myPosition[a, First[a]]
Compiling testPeriod gives a further ~20% speedup in a quick test, but I believe this will depend on the input data:
Clear[testPeriod]
testPeriod =
Compile[{{p, _Integer}, {a, _Integer, 1}},
Drop[a, p] === Drop[a, -p]]

Above methods are better if you have no noise. If your signal is only approximate then Fourier transform methods might be useful. I'll illustrate with a "parametrized" setup wherein the length and number of repetitions of the base signal, the length of the trailing part, and a bound on the noise perturbation are all variables one can play with.
noise = 20;
extra = 40;
baselen = 103;
base = RandomInteger[10000, {baselen}];
repeat = 5;
signal = Flatten[Join[ConstantArray[base, repeat], Take[base, extra]]];
noisysignal = signal + RandomInteger[{-noise, noise}, Length[signal]];
We compute the absolute value of the FFT. We adjoin zeros to both ends. The object will be to threshold by comparing to neighbors.
sigfft = Join[{0.}, Abs[Fourier[noisysignal]], {0}];
Now we create two 0-1 vectors. In one we threshold by making a 1 for each element in the fft that is greater than twice the geometric mean of its two neighbors. In the other we use the average (arithmetic mean) but we lower the size bound to 3/4. This was based on some experimentation. We count the number of 1s in each case. Ideally we'd get 100 for each, as that would be the number of nonzeros in a "perfect" case of no noise and no tail part.
In[419]:=
thresh1 =
Table[If[sigfft[[j]]^2 > 2*sigfft[[j - 1]]*sigfft[[j + 1]], 1,
0], {j, 2, Length[sigfft] - 1}];
count1 = Count[thresh1, 1]
thresh2 =
Table[If[sigfft[[j]] > 3/4*(sigfft[[j - 1]] + sigfft[[j + 1]]), 1,
0], {j, 2, Length[sigfft] - 1}];
count2 = Count[thresh2, 1]
Out[420]= 114
Out[422]= 100
Now we get our best guess as to the value of "repeats", by taking the floor of the total length over the average of our counts.
approxrepeats = Floor[2*Length[signal]/(count1 + count2)]
Out[423]= 5
So we have found that the basic signal is repeated 5 times. That can give a start toward refining to estimate the correct length (baselen, above). To that end we might try removing elements at the end and seeing when we get ffts closer to actually having runs of four 0s between nonzero values.
Something else that might work for estimating number of repeats is finding the modal number of zeros in run length encoding of the thresholded ffts. While I have not actually tried that, it looks like it might be robust to bad choices in the details of how one does the thresholding (mine were just experiments that seem to work).
Daniel Lichtblau

The following assumes that the cycle starts on the first element and gives the period length and the cycle.
findCyclingList[a_?VectorQ] :=
Module[{repeats1, repeats2, cl, cLs, vec},
repeats1 = Flatten#Differences[Position[a, First[a]]];
repeats2 = Flatten[Position[repeats1, First[repeats1]]];
If[Equal ## Differences[repeats2] && Length[repeats2] > 2(*
is potentially cyclic - first element appears cyclically *),
cl = Plus ### Partition[repeats1, First[Differences[repeats2]]];
cLs = Partition[a, First[cl]];
If[SameQ ## cLs (* candidate cycles all actually the same *),
vec = First[cLs];
{Length[vec], vec}, $Failed], $Failed] ]
Testing
b = RandomInteger[50, {100}];
a = Join[b, b, b, b, Take[b, 27]];
findCyclingList[a]
{100, {47, 15, 42, 10, 14, 29, 12, 29, 11, 37, 6, 19, 14, 50, 4, 38,
23, 3, 41, 39, 41, 17, 32, 8, 18, 37, 5, 45, 38, 8, 39, 9, 26, 33,
40, 50, 0, 45, 1, 48, 32, 37, 15, 37, 49, 16, 27, 36, 11, 16, 4, 28,
31, 46, 30, 24, 30, 3, 32, 31, 31, 0, 32, 35, 47, 44, 7, 21, 1, 22,
43, 13, 44, 35, 29, 38, 31, 31, 17, 37, 49, 22, 15, 28, 21, 8, 31,
42, 26, 33, 1, 47, 26, 1, 37, 22, 40, 27, 27, 16}}
b1 = RandomInteger[10000, {100}];
a1 = Join[b1, b1, b1, b1, Take[b1, 23]];
findCyclingList[a1]
{100, {1281, 5325, 8435, 7505, 1355, 857, 2597, 8807, 1095, 4203,
3718, 3501, 7054, 4620, 6359, 1624, 6115, 8567, 4030, 5029, 6515,
5921, 4875, 2677, 6776, 2468, 7983, 4750, 7609, 9471, 1328, 7830,
2241, 4859, 9289, 6294, 7259, 4693, 7188, 2038, 3994, 1907, 2389,
6622, 4758, 3171, 1746, 2254, 556, 3010, 1814, 4782, 3849, 6695,
4316, 1548, 3824, 5094, 8161, 8423, 8765, 1134, 7442, 8218, 5429,
7255, 4131, 9474, 6016, 2438, 403, 6783, 4217, 7452, 2418, 9744,
6405, 8757, 9666, 4035, 7833, 2657, 7432, 3066, 9081, 9523, 3284,
3661, 1947, 3619, 2550, 4950, 1537, 2772, 5432, 6517, 6142, 9774,
1289, 6352}}
This case should fail because it isn't cyclical.
findCyclingList[Join[b, Take[b, 11], b]]
$Failed
I tried to something with Repeated, e.g. a /. Repeated[t__, {2, 100}] -> {t} but it just doesn't work for me.

Does this work for you?
period[a_] :=
Quiet[Check[
First[Cases[
Table[
{k, Equal ## Partition[a, k]},
{k, Floor[Length[a]/2]}],
{k_, True} :> k
]],
$Failed]]
Strictly speaking, this will fail for things like
a = {1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5}
although this can be fixed by using something like:
(Equal ## Partition[a, k]) && (Equal ## Partition[Reverse[a], k])
(probably computing Reverse[a] just once ahead of time.)

I propose this. It borrows from both Verbeia and Brett's answers.
Do[
If[MatchQ ## Equal ## Partition[#, i, i, 1, _], Return ## i],
{i, #[[ 2 ;; Floor[Length##/2] ]] ~Position~ First##}
] /. Null -> $Failed &
It is not quite as efficient as Vebeia's function on long periods, but it is faster on short ones, and it is simpler as well.

I don't know how to solve it in mathematica, but the following algorithm (written in python) should work. It's O(n) so speed should be no concern.
def period(array):
if len(array) == 0:
return False
else:
s = array[0]
match = False
end = 0
i = 0
for k in range(1,len(array)):
c = array[k]
if not match:
if c == s:
i = 1
match = True
end = k
else:
if not c == array[i]:
match = False
i += 1
if match:
return array[:end]
else:
return False
# False
print(period([4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2,1]))
# [4, 5, 1, 2, 3]
print(period([4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2]))
# False
print(period([4]))
# [4, 2]
print(period([4,2,4]))
# False
print(period([4,2,1]))
# False
print(period([]))

Ok, just to show my own work here:
ModifiedTortoiseHare[a_List] := Module[{counter, tortoise, hare},
Quiet[
Check[
counter = 1;
tortoise = a[[counter]];
hare = a[[2 counter]];
While[(tortoise != hare) || (a[[counter ;; 2 counter - 1]] != a[[2 counter ;; 3 counter - 1]]),
counter++;
tortoise = a[[counter]];
hare = a[[2 counter]];
];
counter,
$Failed]]]
I'm not sure this is a 100% correct, especially with cases like {pattern,pattern,different,pattern, pattern} and it gets slower and slower when there are a lot of repeating elements, like so:
{ 1,2,1,1, 1,2,1,1, 1,2,1,1, ...}
because it is making too many expensive comparisons.

#include <iostream>
#include <vector>
using namespace std;
int period(vector<int> v)
{
int p=0; // period 0
for(int i=p+1; i<v.size(); i++)
{
if(v[i] == v[0])
{
p=i; // new potential period
bool periodical=true;
for(int i=0; i<v.size()-p; i++)
{
if(v[i]!=v[i+p])
{
periodical=false;
break;
}
}
if(periodical) return p;
i=p; // try to find new period
}
}
return 0; // no period
}
int main()
{
vector<int> v3{1,2,3,1,2,3,1,2,3};
cout<<"Period is :\t"<<period(v3)<<endl;
vector<int> v0{1,2,3,1,2,3,1,9,6};
cout<<"Period is :\t"<<period(v0)<<endl;
vector<int> v1{1,2,1,1,7,1,2,1,1,7,1,2,1,1};
cout<<"Period is :\t"<<period(v1)<<endl;
return 0;
}

This sounds like it might relate to sequence alignment. These algorithms are well studied, and might already be implemented in mathematica.

Related

Sorting the main diagonal of a matrix

I have a matrix, and its elements on the main diagonal aren't sorted, so I need a function that will return new matrix with sorted elements on the main diagonal. I can't understand why this won't work.
Function[A_] := Module[{res, diagonal = {}, m, n},
{m, n} = Dimensions[A];
Table[AppendTo[diagonal, A[[i, i]]], {i, 1, m}];
dijagonal = SelectionSort[diagonal];
Table[A[[i, i]] = dijagonal[[i]], {i, 1, m}];
Return[A // MatrixForm];
];
Selection sort works.
This can be an example of matrix:
A={{60, 10, 68, 72, 64},{26, 70, 32, 19, 29},{94, 78, 86, 59, 17},
{77, 13, 34, 39, 0}, {31, 71, 11, 48, 83}}
When I run it, this shows up:
Set::setps: {{60,10,68,72,64},{26,70,32,19,29},{94,78,86,59,17},{77,13,34,39,0},{31,71,11,48,83}} in the part assignment is not a symbol. >>
The main issues
you can not define your own function named Function. (Generally avoid using any symbol beginning with a capital to avoid conflicts )
you can not modify the the input argument, so make a copy
just use Sort not SelectionSort
I made a couple of other changes that are more stylistic as well:
function[A0_] :=
Module[{res, diagonal = {}, m, n, A = A0},
{m, n} = Dimensions[A];
diagonal = Table[A[[i, i]], {i, 1, m}];
dijagonal = Sort[diagonal];
Do[A[[i, i]] = dijagonal[[i]], {i, 1, m}]; A]
function[A] // MatrixForm
note you can do all this inline:
ReplacePart[ A,
Table[ {i, i} -> (Sort#Diagonal[A])[[i]], {i, Length#A} ]]
or
(A + DiagonalMatrix[Sort## - # &#Diagonal[A]])

Definition lookup speed: a performance issue

I have the following problem.
I need to build a very large number of definitions(*) such as
f[{1,0,0,0}] = 1
f[{0,1,0,0}] = 2
f[{0,0,1,0}] = 3
f[{0,0,0,1}] = 2
...
f[{2,3,1,2}] = 4
...
f[{n1,n2,n3,n4}] = some integer
...
This is just an example. The length of the argument list does not need to be 4 but can be anything.
I realized that the lookup for each value slows down with exponential complexity when the length of the argument list increases. Perhaps this is not so strange, since it is clear that in principle there is a combinatorial explosion in how many definitions Mathematica needs to store.
Though, I have expected Mathematica to be smart and that value extract should be constant time complexity. Apparently it is not.
Is there any way to speed up lookup time? This probably has to do with how Mathematica internally handles symbol definition lookups. Does it phrases the list until it finds the match? It seems that it does so.
All suggestions highly appreciated.
With best regards
Zoran
(*) I am working on a stochastic simulation software that generates all configurations of a system and needs to store how many times each configuration occurred. In that sense a list {n1, n2, ..., nT} describes a particular configuration of the system saying that there are n1 particles of type 1, n2 particles of type 2, ..., nT particles of type T. There can be exponentially many such configurations.
Could you give some detail on how you worked out that lookup time is exponential?
If it is indeed exponential, perhaps you could speed things up by using Hash on your keys (configurations), then storing key-value pairs in a list like {{key1,value1},{key2,value2}}, kept sorted by key and then using binary search (which should be log time). This should be very quick to code up but not optimum in terms of speed.
If that's not fast enough, one could think about setting up a proper hashtable implementation (which I thought was what the f[{0,1,0,1}]=3 approach did, without having checked).
But some toy example of the slowdown would be useful to proceed further...
EDIT: I just tried
test[length_] := Block[{f},
Do[
f[RandomInteger[{0, 10}, 100]] = RandomInteger[0, 10];,
{i, 1, length}
];
f[{0, 0, 0, 0, 1, 7, 0, 3, 7, 8, 0, 4, 5, 8, 0, 8, 6, 7, 7, 0, 1, 6,
3, 9, 6, 9, 2, 7, 2, 8, 1, 1, 8, 4, 0, 5, 2, 9, 9, 10, 6, 3, 6,
8, 10, 0, 7, 1, 2, 8, 4, 4, 9, 5, 1, 10, 4, 1, 1, 3, 0, 3, 6, 5,
4, 0, 9, 5, 4, 6, 9, 6, 10, 6, 2, 4, 9, 2, 9, 8, 10, 0, 8, 4, 9,
5, 5, 9, 7, 2, 7, 4, 0, 2, 0, 10, 2, 4, 10, 1}] // timeIt
]
with timeIt defined to accurately time even short runs as follows:
timeIt::usage = "timeIt[expr] gives the time taken to execute expr,
repeating as many times as necessary to achieve a total time of \
1s";
SetAttributes[timeIt, HoldAll]
timeIt[expr_] := Module[{t = Timing[expr;][[1]], tries = 1},
While[t < 1.,
tries *= 2;
t = Timing[Do[expr, {tries}];][[1]];
];
Return[t/tries]]
and then
out = {#, test[#]} & /# {10, 100, 1000, 10000, 100000, 100000};
ListLogLogPlot#out
(also for larger runs). So it seems constant time here.
Suppose you enter your information not like
f[{1,0,0,0}] = 1
f[{0,1,0,0}] = 2
but into a n1 x n2 x n3 x n4 matrix m like
m[[2,1,1,1]] = 1
m[[1,2,1,1]] = 2
etc.
(you could even enter values not as f[{1,0,0,0}]=1, but as f[{1,0,0,0},1] with
f[li_List, i_Integer] := Part[m, Apply[Sequence, li + 1]] = i;
f[li_List] := Part[m, Apply[Sequence, li + 1]];
where you have to initialize m e.g. by m = ConstantArray[0, {4, 4, 4, 4}];)
Let's compare timings:
testf[z_] :=
(
Do[ f[{n1, n2, n3, n4}] = RandomInteger[{1,100}], {n1,z}, {n2,z}, {n3,z},{n4,z}];
First[ Timing[ Do[ f[{n2, n4, n1, n3}], {n1, z}, {n2, z}, {n3, z}, {n4, z} ] ] ]
);
Framed[
ListLinePlot[
Table[{z, testf[z]}, {z, 22, 36, 2}],
PlotLabel -> Row[{"DownValue approach: ",
Round[MemoryInUse[]/1024.^2],
" MB needed"
}],
AxesLabel -> {"n1,n2,n3,n4", "time/s"},ImageSize -> 500
]
]
Clear[f];
testf2[z_] :=
(
m = RandomInteger[{1, 100}, {z, z, z, z}];
f2[ni__Integer] := m[[Sequence ## ({ni} + 1)]];
First[ Timing[ Do[ f2[{n2, n4, n1, n3}], {n1, z}, {n2, z}, {n3, z}, {n4, z}] ] ]
)
Framed[
ListLinePlot[
Table[{z, testf2[z]}, {z, 22, 36, 2}],
PlotLabel -> Row[{"Matrix approach: ",
Round[MemoryInUse[]/1024.^2],
" MB needed"
}],
AxesLabel -> {"n1,n2,n3,n4", "time/s"}, ImageSize -> 500
]
]
gives
So for larger sets up information a matrix approach seems clearly preferrable.
Of course, if you have truly large data, say more GB than you have RAM, then you just
have to use a database and DatabaseLink.

Collatz Conjecture in Mathematica

I am new to Mathematica and am trying to understand patterns and rules. So I tried the following:
A = {1, 2, 3, 4}
A //. {x_?EvenQ -> x/2, x_?OddQ -> 3 x + 1}
This is based on: http://en.wikipedia.org/wiki/Collatz_conjecture
This is supposed to converge, but what I got is:
ReplaceRepeated::rrlim: Exiting after {1,2,3,4} scanned 65536 times. >>
Please help me understand my error in the pattern/rule.
Regards
The way you wrote this, it does not terminate, so it eg ends up alternating between 1 and 4, 2 etc. (all recursive descriptions must eventually bottom out somewhere, and your does not include a case to do that at n=1).
This works:
ClearAll[collatz];
collatz[1] = 1;
collatz[n_ /; EvenQ[n]] := collatz[n/2]
collatz[n_ /; OddQ[n]] := collatz[3 n + 1]
although it does not give a list of the intermediate results. A convenient way to get them is
ClearAll[collatz];
collatz[1] = 1;
collatz[n_ /; EvenQ[n]] := (Sow[n]; collatz[n/2])
collatz[n_ /; OddQ[n]] := (Sow[n]; collatz[3 n + 1])
runcoll[n_] := Last#Last#Reap[collatz[n]]
runcoll[115]
(*
-> {115, 346, 173, 520, 260, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56,
28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1}
*)
or
colSeq[x_] := NestWhileList[
Which[
EvenQ[#], #/2,
True, 3*# + 1] &,
x,
# \[NotEqual] 1 &]
so that eg
colSeq[115]
(*
-> {115, 346, 173, 520, 260, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56,
28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1}
*)
By the way the fastest approach I could come up with (I think I needed it for some project Euler problem) was something like
Clear#collatz;
collatz[1] := {1}
collatz[n_] := collatz[n] = If[
EvenQ[n] && n > 0,
{n}~Join~collatz[n/2],
{n}~Join~collatz[3*n + 1]]
compare:
colSeq /# Range[20000]; // Timing
(*
-> {6.87047, Null}
*)
while
Block[{$RecursionLimit = \[Infinity]},
collatz /# Range[20000];] // Timing
(*
-> {0.54443, Null}
*)
(we need to increase the recursion limit to get this to run correctly).
You got the recursive cases right, but you have no base case to terminate the recursion which leads to infinite recursion (or until Mathematica hits the pattern replacement limit). If you stop when you reach 1, it works as expected:
In[1]:= A = {1,2,3,4}
Out[1]= {1,2,3,4}
In[2]:= A //. {x_?EvenQ /; x>1 -> x/2, x_?OddQ /; x>1 -> 3 x+1}
Out[2]= {1,1,1,1}
In the documentation center, the section about writing packages is illustrated with a Collatz function example.

Recursive function definition in Mathematica

Mathematica can solve recursive equations using RSolve. Is it possible to have a function defined by a recurrence, regardless whether the recurrence can or cannot be solved analytically?
Yes. Look at RecurrenceTable. One also can program to define a function by its recurrence equation, factorial being the simplest example.
In[94]:= fac[1] = 1;
fac[k_Integer?Positive] := k*fac[k - 1]
In[96]:= fac[10]
Out[96]= 3628800
In[97]:= Function[If[#1 == 1, 1, #1*#0[#1 - 1]]][10]
Out[97]= 3628800
In[100]:= RecurrenceTable[
f[k] == k f[k - 1] && f[1] == 1, f, {k, 1, 10}]
Out[100]= {1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800}
I wondered for a moment what RecurrenceTable is good for, until I rewrote Sasha's example using NestList:
Rest#NestList[{1, 0} + First## {1, Last##} &, {1, 1}, 10][[All, -1]]
{1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800}
If the involvement of k (First##) is complicated, RecurrenceTable could be far simpler.

A variation of IntegerPartition?

IntegerPartitions[n, {3, 10}, Prime ~Array~ 10]
In Mathematica this will give a list of all the ways to get n as the sum of from three to ten of the first ten prime numbers, allowing duplicates as needed.
How can I efficiently find the sums that equal n, allowing each element to only be used once?
Using the first ten primes is only a toy example. I seek a solution that is valid for arbitrary arguments. In actual cases, generating all possible sums, even using polynomial coefficients, takes too much memory.
I forgot to include that I am using Mathematica 7.
The following will build a binary tree, and then analyze it and extract the results:
Clear[intParts];
intParts[num_, elems_List] /; Total[elems] < num := p[];
intParts[num_, {fst_, rest___}] /;
fst < num := {p[fst, intParts[num - fst, {rest}]], intParts[num, {rest}]};
intParts[num_, {fst_, rest___}] /; fst > num := intParts[num, {rest}];
intParts[num_, {num_, rest___}] := {pf[num], intParts[num, {rest}]};
Clear[nextPosition];
nextPosition =
Compile[{{pos, _Integer, 1}},
Module[{ctr = 0, len = Length[pos]},
While[ctr < len && pos[[len - ctr]] == 1, ++ctr];
While[ctr < len && pos[[len - ctr]] == 2, ++ctr];
Append[Drop[pos, -ctr], 1]], CompilationTarget -> "C"];
Clear[getPartitionsFromTree, getPartitions];
getPartitionsFromTree[tree_] :=
Map[Extract[tree, #[[;; -3]] &#FixedPointList[nextPosition, #]] &,
Position[tree, _pf, Infinity]] /. pf[x_] :> x;
getPartitions[num_, elems_List] :=
getPartitionsFromTree#intParts[num, Reverse#Sort[elems]];
For example,
In[14]:= getPartitions[200,Prime~Array~150]//Short//Timing
Out[14]= {0.5,{{3,197},{7,193},{2,5,193},<<4655>>,{3,7,11,13,17,19,23,29,37,41},
{2,3,5,11,13,17,19,23,29,37,41}}}
This is not insanely fast, and perhaps the algorithm could be optimized further, but at least the number of partitions does not grow as fast as for IntegerPartitions.
Edit:
It is interesting that simple memoization speeds the solution up about twice on the example I used before:
Clear[intParts];
intParts[num_, elems_List] /; Total[elems] < num := p[];
intParts[num_, seq : {fst_, rest___}] /; fst < num :=
intParts[num, seq] = {p[fst, intParts[num - fst, {rest}]],
intParts[num, {rest}]};
intParts[num_, seq : {fst_, rest___}] /; fst > num :=
intParts[num, seq] = intParts[num, {rest}];
intParts[num_, seq : {num_, rest___}] :=
intParts[num, seq] = {pf[num], intParts[num, {rest}]};
Now,
In[118]:= getPartitions[200, Prime~Array~150] // Length // Timing
Out[118]= {0.219, 4660}
Can use Solve over Integers, with multipliers constrained between 0 and 1. I'll show for a specific example (first 10 primes, add to 100) but it is easy to make a general procedure for this.
primeset = Prime[Range[10]];
mults = Array[x, Length[primeset]];
constraints01 = Map[0 <= # <= 1 &, mults];
target = 100;
Timing[res = mults /.
Solve[Flatten[{mults.primeset == target, constraints01}],
mults, Integers];
Map[Pick[primeset, #, 1] &, res]
]
Out[178]= {0.004, {{7, 11, 13, 17, 23, 29},
{5, 11, 13, 19, 23, 29}, {5, 7, 17, 19, 23, 29},
{2, 5, 11, 13, 17, 23, 29}, {2, 3, 11, 13, 19, 23, 29},
{2, 3, 7, 17, 19, 23, 29}, {2, 3, 5, 7, 11, 13, 17, 19, 23}}}
---edit---
To do this in version 7 one would use Reduce instead of Solve. I'll bundle this in one function.
knapsack[target_, items_] := Module[
{newset, x, mults, res},
newset = Select[items, # <= target &];
mults = Array[x, Length[newset]];
res = mults /.
{ToRules[Reduce[
Flatten[{mults.newset == target, Map[0 <= # <= 1 &, mults]}],
mults, Integers]]};
Map[Pick[newset, #, 1] &, res]]
Here is Leonid Shifrin's example:
Timing[Length[knapsack[200, Prime[Range[150]]]]]
Out[128]= {1.80373, 4660}
Not as fast as the tree code, but still (I think) reasonable behavior. At least, not obviously unreasonable.
---end edit---
Daniel Lichtblau
Wolfram Research
I would like to propose a solution, similar in spirit to Leonid's but shorter and less memory intensive. Instead of building the tree and post-processing it, the code walks the tree and Sows the solution when found:
Clear[UniqueIntegerParitions];
UniqueIntegerParitions[num_Integer?Positive,
list : {__Integer?Positive}] :=
Block[{f, $RecursionLimit = Infinity},
f[n_, cv_, {n_, r___}] :=
(Sow[Flatten[{cv, n}]]; f[n, cv, {r}];);
f[n_, cv_, {m_, r___}] /; m > n := f[n, cv, {r}];
f[n_, cv_, {m_, r___}] /;
Total[{r}] >= n - m := (f[n - m, {cv, m}, {r}]; f[n, cv, {r}]);
f[___] := Null;
Part[Reap[f[num, {}, Reverse#Union[Cases[list, x_ /; x <= num]]]],
2, 1]]
This code is slower than Leonid's
In[177]:=
UniqueIntegerParitions[200, Prime~Array~PrimePi[200]] //
Length // Timing
Out[177]= {0.499, 4660}
but uses about >~ 6 times less memory, thus allowing to go further.

Resources