Find incremental x amount of numbers in range - algorithm

I don't even know how to explain this... I've been looking for algos but no luck.
I need a function that would return an array of incrementally bigger numbers (not sure what kind of curve) from two numbers that I'd pass as parameters.
Ex.:
$length = 20;
get_numbers(1, 1000, $length);
> 1, 2, 3, 5, 10, 20, 30, 50, 100, 200, 500... // let's say that these are 20 numbers that add up to 1000
Any idea how I could do this..? I guess I'm not smart enough to figure it out.

How about an exponential curve? Sample Python implementation:
begin = 1
end = 1000
diff = end - begin
length = 10
# X is the common ratio, chosen so that X**(length-1) == diff
X = diff**(1.0/(length-1))
seq = []
for i in range(length):
    seq.append(int(begin + X**i))
print(seq)
(note: ** is the Python operator for exponentiation. Other languages may or may not use ^ instead)
Result:
[2, 3, 5, 10, 22, 47, 100, 216, 464, 999]
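The result above starts at begin + 1 and, because of float truncation, stops one short of end. A minimal sketch of a variant that pins both endpoints and keeps the values strictly increasing could look like this. The name get_numbers comes from the question; the rounding policy and the argument checks are assumptions, not part of the original answer.

```python
def get_numbers(begin, end, length):
    """Return `length` strictly increasing integers from `begin` to `end`,
    spaced (approximately) geometrically."""
    # Assumed preconditions: positive start, and enough room for distinct integers.
    if length < 2 or begin <= 0 or end - begin < length - 1:
        raise ValueError("need length >= 2, begin > 0 and end - begin >= length - 1")
    ratio = (end / begin) ** (1.0 / (length - 1))
    values = []
    for i in range(length):
        v = round(begin * ratio ** i)
        # Rounding can repeat small values; bump to keep the sequence increasing.
        if values and v <= values[-1]:
            v = values[-1] + 1
        values.append(v)
    values[-1] = end  # pin the last value against float error
    return values


print(get_numbers(1, 1000, 20))
```

The curve is geometric only approximately near the start, where the bump to the next integer takes over from the rounded exponential.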

Related

how do i reverse individual (and specific) columns in a 2d array (RUBY)

Ruby: the goal is to get the max value from each of the four zones and sum them.
UPDATE: I came up with a solution; sorry about the mix-up. It turned out that the matrix is a 2n x 2n matrix, so it could be larger or smaller than 4x4. The solution I wrote below worked on all of the test cases. Here is a link to the problem
I tried doing matrix.transpose and then reversing the specific arrays, but that didn't work for all edge cases.
Here is the code for that:
def flippingMatrix(matrix)
  2.times do
    matrix = matrix.transpose
    matrix = matrix.map do |array|
      if (array[-1] == array.max) || (array[-2] == array.max)
        array.reverse
      else
        array
      end
    end
  end
  return matrix[0][0] + matrix[0][1] + matrix[1][0] + matrix[1][1]
end
I gave up and tried the code below, which in my mind works; it also works for most edge cases, but not all.
But I'm getting an error (undefined method `[]' for nil:NilClass (NoMethodError)).
Keep in mind that when I print the result, or spot_1, spot_2, spot_3, or spot_4, I get the correct answer. Does anyone have an idea why this is happening?
Here is a matrix that FAILED
[
[517, 37, 380, 3727],
[3049, 1181, 2690, 1587],
[3227, 3500, 2665, 383],
[4041, 2013, 384, 965]
]
**expected output: 12,781 (this fails)**
**because 4041 + 2013 + 3227 + 3500 = 12,781**
Here is a matrix that PASSED
[
[112, 42, 83, 119],
[56, 125, 56, 49],
[15, 78, 101, 43],
[62, 98, 114, 108],
]
**expected output: 414 (this passes)**
Here is the code
def flippingMatrix(matrix)
  # Write your code here
  spot_1 = [matrix[0][0], matrix[0][3], matrix[3][0], matrix[3][3]].max
  spot_2 = [matrix[0][1], matrix[0][2], matrix[3][1], matrix[3][2]].max
  spot_3 = [matrix[1][0], matrix[1][3], matrix[2][0], matrix[2][3]].max
  spot_4 = [matrix[1][1], matrix[1][2], matrix[2][1], matrix[2][2]].max
  return (spot_1 + spot_2 + spot_3 + spot_4)
end
I will answer your question and at the same time suggest two other ways to obtain the desired sum.
Suppose
arr = [
[ 1, 30, 40, 2],
[300, 4000, 1000, 200],
[400, 3000, 2000, 100],
[ 4, 10, 20, 3]
]
First solution
We see that the desired return value is 4444: the sum of the largest "A", "B", "C" and "D" values (4 + 40 + 400 + 4000), where the groups are laid out as follows.
A B B A
C D D C
C D D C
A B B A
First create three helper methods.
Compute the largest value among the four inner elements
def mx(arr)
  [arr[1][1], arr[1][2], arr[2][1], arr[2][2]].max
end
mx(arr)
#=> 4000
This is the largest of the "D" values.
Reverse the first two and last two rows
def row_flip(arr)
  [arr[1], arr[0], arr[3], arr[2]]
end
row_flip(arr)
#=> [[300, 4000, 1000, 200],
# [ 1, 30, 40, 2],
# [ 4, 10, 20, 3],
# [400, 3000, 2000, 100]]
This allows us to use the method mx to obtain the largest of the "B" values.
Reverse the first two and last two columns
def column_flip(arr)
  row_flip(arr.transpose).transpose
end
column_flip(arr)
#=> [[ 30, 1, 2, 40],
# [4000, 300, 200, 1000],
# [3000, 400, 100, 2000],
# [ 10, 4, 3, 20]]
This allows us to use the method mx to obtain the largest of the "C" values.
Lastly, the maximum of the "A" values can be computed as follows.
t = row_flip(column_flip(arr))
#=> [[4000, 300, 200, 1000],
# [ 30, 1, 2, 40],
# [ 10, 4, 3, 20],
# [3000, 400, 100, 2000]]
mx(t)
#=> 4
The sum of the maximum values may therefore be computed as follows.
def sum_max(arr)
  t = column_flip(arr)
  mx(arr) + mx(row_flip(arr)) + mx(t) + mx(row_flip(t))
end
sum_max(arr)
#=> 4444
Second solution
Another way is the following:
[0, 1].product([0, 1]).sum do |i, j|
  [arr[i][j], arr[i][-j-1], arr[-i-1][j], arr[-i-1][-j-1]].max
end
#=> 4444
To see how this works, let me break it into two statements and add a puts statement. Note that, for each of the groups A, B, C and D, the block variables i and j are the row and column indices of the top-left element of the group.
top_left_indices = [0, 1].product([0, 1])
#=> [[0, 0], [0, 1], [1, 0], [1, 1]]
top_left_indices.sum do |i, j|
  a = [arr[i][j], arr[i][-j-1], arr[-i-1][j], arr[-i-1][-j-1]]
  puts a.inspect
  a.max
end
#=> 4444
This prints the following.
[1, 2, 4, 3]
[30, 40, 10, 20]
[300, 200, 400, 100]
[4000, 1000, 3000, 2000]
Ahhh, I came up with an answer that handles all edge cases. I originally saw something like this in JavaScript and turned it into Ruby. Apparently some of the hidden test cases weren't 4 by 4; some were smaller and some larger, and that was the cause of the nil error.
Here is the solution:
def flippingMatrix(matrix)
  total = []
  (0...(matrix.length/2)).each do |idx1|
    (0...(matrix.length/2)).each do |idx2|
      total << [matrix[idx1][idx2],
                matrix[(matrix.length - 1)-idx1][idx2],
                matrix[idx1][(matrix.length - 1)-idx2],
                matrix[(matrix.length - 1)-idx1][(matrix.length - 1)-idx2]].max
    end
  end
  total.sum
end
Thank you all for your support! I hope this helps someone in the near future.
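For readers who do not use Ruby, the same idea (for each cell of the top-left quadrant, take the maximum over the cell and its three mirror images) can be sketched in Python. The function name flipping_matrix_sum is hypothetical, and the sketch assumes a square matrix with an even side length, as stated in the update above.

```python
def flipping_matrix_sum(matrix):
    """Sum, over the top-left quadrant, the max of each cell and its three mirror images."""
    size = len(matrix)  # assumed: square matrix with even side length 2n
    half = size // 2
    total = 0
    for i in range(half):
        for j in range(half):
            total += max(
                matrix[i][j],
                matrix[i][size - 1 - j],
                matrix[size - 1 - i][j],
                matrix[size - 1 - i][size - 1 - j],
            )
    return total


failed = [
    [517, 37, 380, 3727],
    [3049, 1181, 2690, 1587],
    [3227, 3500, 2665, 383],
    [4041, 2013, 384, 965],
]
passed = [
    [112, 42, 83, 119],
    [56, 125, 56, 49],
    [15, 78, 101, 43],
    [62, 98, 114, 108],
]
print(flipping_matrix_sum(failed))  # 12781
print(flipping_matrix_sum(passed))  # 414
```

Because the indices are derived from the matrix size, a 2x2 or 6x6 input never reaches a missing row, which is what triggered the nil error in the hard-coded 4x4 version.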

Algorithm to efficiently select rows from a matrix such that column totals are equal

The practical application of this problem is group assignment in a psychology study, but the theoretical formulation is this:
I have a matrix (the actual matrix has 72 rows and 27 columns, but I'll use one with 8 rows and 4 columns as an example):
1 0 1 0
0 1 0 1
1 1 0 0
0 1 1 0
0 0 1 1
1 0 1 0
1 1 0 0
0 1 0 1
I want to pick half of the rows out of this matrix such that the column totals are equal (thus effectively creating two matrices with equivalent column totals). I cannot rearrange values within the rows.
I have tried some brute force solutions, but my matrix is too large for that to be effective, even having chosen some random restrictions first. It seems to me that the search space could be constrained with a better algorithm, but I haven't been able to think of one thus far. Any ideas? It is also possible that there is no solution, so an algorithm would have to be able to deal with that. I have been working in R, but I could switch to python easily.
Update
Found a solution thanks to ljeabmreosn. Karmarkar-Karp worked great as an algorithm, and converting the rows to base 73 was inspired. I had a surprisingly hard time finding code that would actually give me the sub-sequences rather than just the final difference (maybe most people are only interested in this problem in the abstract?). Anyway, this is what I did:
First I converted my rows into base 73 as the poster suggested. To do this I used the basein package in Python, defining an alphabet with 73 characters and then using the basein.decode function to convert to decimal.
For the algorithm, I just added code to print the sub-sequence indices to the code from this mailing list message from Tim Peters: https://mail.python.org/pipermail/tutor/2001-August/008098.html
# Python 2 code (print statements and long literals).
from __future__ import nested_scopes
import sys
import bisect
class _Num:
    def __init__(self, value, index):
        self.value = value
        self.i = index
    def __lt__(self, other):
        return self.value < other.value
# This implements the Karmarkar-Karp heuristic for partitioning a set
# in two, i.e. into two disjoint subsets s.t. their sums are
# approximately equal. It produces only one result, in O(N*log N)
# time. A remarkable property is that it loves large sets: in
# general, the more numbers you feed it, the better it does.
class Partition:
    def __init__(self, nums):
        self.nums = nums
        sorted = [_Num(nums[i], i) for i in range(len(nums))]
        sorted.sort()
        self.sorted = sorted
    def run(self):
        sorted = self.sorted[:]
        N = len(sorted)
        connections = [[] for i in range(N)]
        while len(sorted) > 1:
            bigger = sorted.pop()
            smaller = sorted.pop()
            # Force these into different sets, by "drawing a
            # line" connecting them.
            i, j = bigger.i, smaller.i
            connections[i].append(j)
            connections[j].append(i)
            diff = bigger.value - smaller.value
            assert diff >= 0
            bisect.insort(sorted, _Num(diff, i))
        # Now sorted contains only 1 element x, and x.value is
        # the difference between the subsets' sums.
        # Theorem: The connections matrix represents a spanning tree
        # on the set of index nodes, and any tree can be 2-colored.
        # 2-color this one (with "colors" 0 and 1).
        index2color = [None] * N
        def color(i, c):
            if index2color[i] is not None:
                assert index2color[i] == c
                return
            index2color[i] = c
            for j in connections[i]:
                color(j, 1-c)
        color(0, 0)
        # Partition the indices by their colors.
        subsets = [[], []]
        for i in range(N):
            subsets[index2color[i]].append(i)
        return subsets
# sys.argv always contains the script name, so check for a second entry.
if len(sys.argv) < 2:
    print "error no arguments provided"
else:
    # The input file holds one (already encoded) integer per line.
    f = open(sys.argv[1], "r")
    x = [int(line.strip()) for line in f]
    f.close()
    p = Partition(x)
    s, t = p.run()
    sum1 = 0L
    sum2 = 0L
    for i in s:
        sum1 += x[i]
    for i in t:
        sum2 += x[i]
    print "Set 1:"
    print s
    print "Set 2:"
    print t
    print "Set 1 sum", repr(sum1)
    print "Set 2 sum", repr(sum2)
    print "difference", repr(abs(sum1 - sum2))
This gives the following output:
Set 1:
[0, 3, 5, 6, 9, 10, 12, 15, 17, 19, 21, 22, 24, 26, 28, 31, 32, 34, 36, 38, 41, 43, 45, 47, 48, 51, 53, 54, 56, 59, 61, 62, 65, 66, 68, 71]
Set 2:
[1, 2, 4, 7, 8, 11, 13, 14, 16, 18, 20, 23, 25, 27, 29, 30, 33, 35, 37, 39, 40, 42, 44, 46, 49, 50, 52, 55, 57, 58, 60, 63, 64, 67, 69, 70]
Set 1 sum 30309344369339288555041174435706422018348623853211009172L
Set 2 sum 30309344369339288555041174435706422018348623853211009172L
difference 0L
Which provides the indices of the proper subsets in a few seconds. Thanks everybody!
Assuming each entry in the matrix can either be 0 or 1, this problem seems to be in the same family as the Partition Problem which only has a pseudo-polynomial time algorithm. Let r be the number of rows in the matrix and c be the number of columns in the matrix. Then, encode each row to a c-digit number of base r+1. This is to ensure when adding each encoding, there is no need to carry, thus equivalent numbers in this base will equate to two sets of rows whose column sums are equivalent. So in your example, you would convert each row into a 4-digit number of base 9. This would yield the numbers (converted into base 10):
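1010 (base 9) => 738 (base 10)
0101 (base 9) => 82 (base 10)
1100 (base 9) => 810 (base 10)
0110 (base 9) => 90 (base 10)
0011 (base 9) => 10 (base 10)
1010 (base 9) => 738 (base 10)
1100 (base 9) => 810 (base 10)
0101 (base 9) => 82 (base 10)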
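The encoding step can be sketched in a few lines of Python. The helper name encode_row is hypothetical; the rows are the 8-row example from the question, and the base is r + 1 = 9 as described above.

```python
def encode_row(row, base):
    """Read a row of 0/1 entries as the digits of one number written in `base`."""
    value = 0
    for digit in row:
        value = value * base + digit
    return value


rows = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
base = len(rows) + 1  # r + 1, so column sums can never carry into the next digit
codes = [encode_row(row, base) for row in rows]
print(codes)  # [738, 82, 810, 90, 10, 738, 810, 82]
```

Python integers have arbitrary precision, so the same helper should also cope with 27 digits in base 73.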
Although you probably couldn't use the pseudo-polynomial time algorithm with this method, you could use a simple heuristic with some decision trees to try to speed up the brute force. Using the numbers above, you could try the Karmarkar-Karp heuristic. Implemented below is the first step of the algorithm in Python 3:
# Sorted (descending) => 810, 810, 738, 738, 90, 82, 82, 10
from queue import PriorityQueue
def karmarkar_karp_partition(arr):
    pqueue = PriorityQueue()
    # PriorityQueue is a min-heap, so store negated keys to pop the largest first
    for e in arr:
        pqueue.put_nowait((-e, e))
    # Repeatedly replace the two largest numbers by their difference
    for _ in range(len(arr)-1):
        _, first = pqueue.get_nowait()
        _, second = pqueue.get_nowait()
        diff = first - second
        pqueue.put_nowait((-diff, diff))
    # The last remaining number is the difference between the two subset sums
    return pqueue.get_nowait()[1]
Here is the algorithm fully implemented. Note that this method is simply a heuristic and may fail to find the best partition.
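Since the full implementation is not reproduced here, the following is a minimal Python 3 sketch of one way to extend the differencing step so that it also reports which indices end up on each side. It is not the linked implementation; the function name karmarkar_karp_sets and the bookkeeping with index lists are assumptions made for illustration.

```python
import heapq
from itertools import count


def karmarkar_karp_sets(nums):
    """Karmarkar-Karp differencing that also tracks the two index sets.

    Returns (difference, left_indices, right_indices).
    """
    if not nums:
        return 0, [], []
    tie = count()  # tie-breaker so the heap never has to compare the index lists
    # Each entry represents a partial partition with sum(left) - sum(right) = value >= 0.
    heap = [(-value, next(tie), [i], []) for i, value in enumerate(nums)]
    heapq.heapify(heap)
    while len(heap) > 1:
        neg_a, _, a_left, a_right = heapq.heappop(heap)  # largest difference
        neg_b, _, b_left, b_right = heapq.heappop(heap)  # second largest
        # Place the two partial partitions on opposite sides of each other.
        diff = (-neg_a) - (-neg_b)
        heapq.heappush(heap, (-diff, next(tie), a_left + b_right, a_right + b_left))
    neg_diff, _, left, right = heap[0]
    return -neg_diff, sorted(left), sorted(right)


codes = [738, 82, 810, 90, 10, 738, 810, 82]
difference, left, right = karmarkar_karp_sets(codes)
print(difference, left, right)
```

A difference of 0 would mean the two groups have equal column totals. For the 8-row example the difference cannot be 0, because the second column sums to 5. Note also that equal sums do not by themselves force both groups to contain the same number of rows.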

Generating a subset uniformly at random?

Here is an implementation of a combinatorial algorithm to choose a subset of an n-set, uniformly at random. Since there are 2^n subsets of an n-set, each subset should have a probability of 2^-n of being selected.
I believe I have implemented the algorithm correctly (please let me know if there is a bug somewhere). When I run the program with Java 7 on my Linux box, however, I get results that I cannot explain. The mystery seems to be around the random number generator. I understand that one needs to run the program a 'large number' of times to 'see that the distribution reaches uniformity'. The question, however, is how large is large. A few runs I did suggest that unless the experiment is repeated at least 1 billion times, the distribution of chosen subsets is quite nonuniform.
The algorithm is based on Prof. Herbert Wilf's combinatorial algorithms book where the implementation (slightly different) is done in Fortran and the distribution is more-or-less uniform even when the program is run only 1280 times.
Here are a few sample runs (there's some variation among the run when n is constant) to get a random subset of a 4-set:
Number of times experiment is done n = 1280
Number of times experiment is done n = 12,800
Number of times experiment is done n = 128,000 (still 8 subsets only!)
Number of times experiment is done n = 1,280,000
Number of times experiment is done n = 12,800,000 (now it starts making sense)
Number of times experiment is done n = 1,280,000,000 (this is okay!)
Would you expect such performance? How could Prof. Wilf achieve similar results with only 1280 iterations of an equivalent program?
Every time you call ranInt(), you reseed the RNG with the current time. Calls that land in the same millisecond start from the same seed and therefore return the same values, so in the long run these numbers are no longer random.
I moved Random r = new Random(System.currentTimeMillis()); to the top of the class and made it static:
class RandomSubsetSimulation {
    static Random r = new Random(System.currentTimeMillis());
    public static void main(String[] args) { ...
I am able to get the following results with 8-set
Total: 1000, number of subsets with a frequency > 0: 256
Total # of subsets possible: 256
Full results with 4-set
Frequencies of chosen subsets ....
[3] : 76, 4, 5.94
[4] : 72, 8, 5.63
[] : 83, -3, 6.48
[1] : 90, -10, 7.03
[2] : 80, 0, 6.25
[3, 4] : 86, -6, 6.72
[2, 3] : 88, -8, 6.88
[2, 4] : 55, 25, 4.30
[1, 2, 3] : 99, -19, 7.73
[1, 2, 4] : 75, 5, 5.86
[2, 3, 4] : 76, 4, 5.94
[1, 3] : 85, -5, 6.64
[1, 2] : 94, -14, 7.34
[1, 4] : 72, 8, 5.63
[1, 2, 3, 4] : 71, 9, 5.55
[1, 3, 4] : 78, 2, 6.09
Total: 1280, number of subsets with a frequency > 0: 16
Total # of subsets possible: 16
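The effect of reseeding can be illustrated with a small simulation. This is a Python sketch rather than the original Java, and it assumes the simplest selection scheme (include each element independently with probability 1/2), which may differ from the algorithm in the question. The fixed seed 12345 stands in for calls that land in the same clock tick; both seeds and all names here are hypothetical.

```python
import random
from collections import Counter


def random_subset(n, rng):
    """Include each of 1..n independently with probability 1/2."""
    return tuple(k for k in range(1, n + 1) if rng.random() < 0.5)


def simulate(n, trials, reseed_each_call):
    shared = random.Random(2024)  # one generator reused for every draw
    counts = Counter()
    for _ in range(trials):
        # Reseeding with an unchanged seed mimics repeated calls within one clock tick.
        rng = random.Random(12345) if reseed_each_call else shared
        counts[random_subset(n, rng)] += 1
    return counts


print(len(simulate(4, 1280, reseed_each_call=False)))  # distinct subsets seen with a shared generator
print(len(simulate(4, 1280, reseed_each_call=True)))   # distinct subsets seen when reseeding every time
```

With the shared generator, 1280 draws should cover all 16 subsets of a 4-set; with the generator recreated from the same seed on every call, every draw repeats the same subset.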

How to find out which subtotal make up a sum?

I need to find numbers in a list, which make up a specific total:
Sum: 500
Subtotals: 10 490 20 5 5
In the end I need: {10 490, 490 5 5}
What is this type of problem called? Are there algorithms to solve it efficiently?
This is the knapsack problem, which is NP-complete, i.e. no efficient algorithm is known for it.
This is not a knapsack problem.
In the worst case, with N subtotals, there can be O(2^N) solutions, so no algorithm can do better than that in the worst case (thus, the problem doesn't belong to the NP class at all).
Let's assume there are no non-positive elements in the Subtotals array and that no element is greater than Sum. We can sort the array of subtotals in descending order, then build an array of tail sums, adding 0 to the end. In your example, it will look like:
Subtotals: (490, 20, 10, 5, 5)
PartialSums: (530, 40, 20, 10, 5, 0)
Now for any "remaining sum" S, position i, and "current list" L we have problem E(S, i, L):
E(0, i, L) = (print L).
E(S, i, L) | (PartialSums[i] < S) = (nothing).
E(S, i, L) = E(S, i+1, L), E(S-Subtotals[i], j, L||Subtotals[i]), where j is the index of the first element of Subtotals less than or equal to (S-Subtotals[i]), or i+1, whichever is greater.
Our problem is E(Sum, 0, {}).
Of course, there's a problem with duplicates (if there were another 490 number in your list, this algorithm would output 4 solutions). If that's not what you need, using array of pairs (value, multiplicity) may help.
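The recursion E(S, i, L) can be sketched in Python as follows. This is a simplified version under the same assumptions (positive subtotals): it uses the tail sums for pruning but always continues at i + 1 instead of jumping ahead to index j. The function name subsets_with_sum is hypothetical.

```python
def subsets_with_sum(subtotals, target):
    """Yield every selection (by position) of positive `subtotals` that adds up to `target`."""
    values = sorted(subtotals, reverse=True)
    # tail[i] is the sum of values[i:], with a trailing 0 (the PartialSums array).
    tail = [0] * (len(values) + 1)
    for i in range(len(values) - 1, -1, -1):
        tail[i] = tail[i + 1] + values[i]

    def explore(remaining, i, chosen):
        if remaining == 0:
            yield list(chosen)
            return
        if tail[i] < remaining:
            return  # even taking everything that is left cannot reach the target
        if values[i] <= remaining:
            chosen.append(values[i])
            yield from explore(remaining - values[i], i + 1, chosen)
            chosen.pop()
        yield from explore(remaining, i + 1, chosen)

    yield from explore(target, 0, [])


print(list(subsets_with_sum([10, 490, 20, 5, 5], 500)))  # [[490, 10], [490, 5, 5]]
```

As noted above, equal values at different positions are treated as distinct, so a list with a second 490 would report each solution more than once.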
P.S. You may also consider dynamic programming if the size of the problem is small enough:
Start with the set {0}. Create an array of sets equal in size to the array of subtotals.
For every subtotal, create a new set from the previous set by adding the subtotal value to each element. Remove all elements greater than Sum. Merge it with the previous set (the result is essentially the set of all possible sums).
If the final set doesn't contain Sum, then there is no solution. Otherwise, backtrack the solution from Sum to 0, checking whether the previous set contains [value] and [value-subtotal].
Example:
(10, 490, 20, 5, 5)
Sets:
(0)
(0, 10)
(0, 10, 490, 500)
(0, 10, 20, 30, 490, 500) (510, 520 - discarded)
(0, 5, 10, 15, 20, 25, 30, 35, 490, 495, 500)
(0, 5, 10, 15, 20, 25, 30, 35, 40, 490, 495, 500)
From last set: [500-5] in previous set, [495-5] in previous set, [490-20] not in previous set ([490] is), [490-490] is 0, resulting answer {5, 5, 490}.
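The set-building and backtracking steps can be sketched in Python like this. The function name one_subset_with_sum is hypothetical, and the sketch returns a single solution, preferring to use a subtotal whenever value - subtotal is reachable from the earlier subtotals, as in the worked example.

```python
def one_subset_with_sum(subtotals, target):
    """Return one list of subtotals adding up to `target`, or None if there is none."""
    # reachable[k] holds every sum <= target obtainable from the first k subtotals.
    reachable = [{0}]
    for v in subtotals:
        prev = reachable[-1]
        reachable.append(prev | {s + v for s in prev if s + v <= target})
    if target not in reachable[-1]:
        return None
    picked, value = [], target
    # Walk back from the last subtotal to the first.
    for k in range(len(subtotals), 0, -1):
        v = subtotals[k - 1]
        if value - v in reachable[k - 1]:
            picked.append(v)  # this subtotal is part of the solution
            value -= v
        # otherwise `value` itself is already in reachable[k - 1], so skip v
    return picked


print(one_subset_with_sum([10, 490, 20, 5, 5], 500))  # [5, 5, 490]
```

The memory use grows with the number of distinct reachable sums, which is why this approach is only attractive when Sum is small.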

What is the best way to find the period of a (repeating) list in Mathematica?

What is the best way to find the period in a repeating list?
For example:
a = {4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2}
has repeat {4, 5, 1, 2, 3} with the remainder {4, 5, 1, 2} matching, but being incomplete.
The algorithm should be fast enough to handle longer cases, like so:
b = RandomInteger[10000, {100}];
a = Join[b, b, b, b, Take[b, 27]]
The algorithm should return $Failed if there is no repeating pattern like above.
Please see the comments interspersed with the code on how it works.
(* True if a has period p *)
testPeriod[p_, a_] := Drop[a, p] === Drop[a, -p]
(* are all the list elements the same? *)
homogeneousQ[list_List] := Length@Tally[list] === 1
homogeneousQ[{}] := Throw[$Failed] (* yes, it's ugly to put this here ... *)
(* auxiliary for findPeriodOfFirstElement[] *)
reduce[a_] := Differences@Flatten@Position[a, First[a], {1}]
(* the first element occurs every ?th position ? *)
findPeriodOfFirstElement[a_] := Module[{nl},
  nl = NestWhileList[reduce, reduce[a], ! homogeneousQ[#] &];
  Fold[Total@Take[#2, #1] &, 1, Reverse[nl]]
  ]
(* the period must be a multiple of the period of the first element *)
period[a_] := Catch@With[{fp = findPeriodOfFirstElement[a]},
  Do[
   If[testPeriod[p, a], Return[p]],
   {p, fp, Quotient[Length[a], 2], fp}
   ]
  ]
Please ask if findPeriodOfFirstElement[] is not clear. I did this independently (for fun!), but now I see that the principle is the same as in Verbeia's solution, except the problem pointed out by Brett is fixed.
I was testing with
b = RandomInteger[100, {1000}];
a = Flatten[{ConstantArray[b, 1000], Take[b, 27]}];
(Note the low integer values: there will be lots of repeating elements within the same period.)
EDIT: According to Leonid's comment below, another 2-3x speedup (~2.4x on my machine) is possible by using a custom position function, compiled specifically for lists of integers:
(* Leonid's reduce[] *)
myPosition = Compile[
  {{lst, _Integer, 1}, {val, _Integer}},
  Module[{pos = Table[0, {Length[lst]}], i = 1, ctr = 0},
   For[i = 1, i <= Length[lst], i++,
    If[lst[[i]] == val, pos[[++ctr]] = i]
    ];
   Take[pos, ctr]
   ],
  CompilationTarget -> "C", RuntimeOptions -> "Speed"
  ]
reduce[a_] := Differences@myPosition[a, First[a]]
Compiling testPeriod gives a further ~20% speedup in a quick test, but I believe this will depend on the input data:
Clear[testPeriod]
testPeriod =
Compile[{{p, _Integer}, {a, _Integer, 1}},
Drop[a, p] === Drop[a, -p]]
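For readers without Mathematica, the core test used by testPeriod (a list has period p when dropping its first p elements gives the same list as dropping its last p elements) can be sketched in Python. This is a simplified version that tries every position where the first element recurs, rather than the stride-based search of the answer; the function name find_period and the use of None in place of $Failed are assumptions.

```python
def find_period(a):
    """Return the smallest period p <= len(a) // 2 of list `a`, or None if there is none."""
    n = len(a)
    for p in range(1, n // 2 + 1):
        # A valid period must bring the first element back, so check that cheaply first.
        if a[p] == a[0] and a[p:] == a[:-p]:
            return p
    return None


print(find_period([4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2]))  # 5
print(find_period([1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5]))           # None
```

Each slice comparison is linear, so the worst case is quadratic when the first element occurs very often.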
Above methods are better if you have no noise. If your signal is only approximate then Fourier transform methods might be useful. I'll illustrate with a "parametrized" setup wherein the length and number of repetitions of the base signal, the length of the trailing part, and a bound on the noise perturbation are all variables one can play with.
noise = 20;
extra = 40;
baselen = 103;
base = RandomInteger[10000, {baselen}];
repeat = 5;
signal = Flatten[Join[ConstantArray[base, repeat], Take[base, extra]]];
noisysignal = signal + RandomInteger[{-noise, noise}, Length[signal]];
We compute the absolute value of the FFT. We adjoin zeros to both ends. The object will be to threshold by comparing to neighbors.
sigfft = Join[{0.}, Abs[Fourier[noisysignal]], {0}];
Now we create two 0-1 vectors. In one we threshold by making a 1 for each element in the fft that is greater than twice the geometric mean of its two neighbors. In the other we use the average (arithmetic mean) but we lower the size bound to 3/4. This was based on some experimentation. We count the number of 1s in each case. Ideally we'd get 100 for each, as that would be the number of nonzeros in a "perfect" case of no noise and no tail part.
In[419]:=
thresh1 =
Table[If[sigfft[[j]]^2 > 2*sigfft[[j - 1]]*sigfft[[j + 1]], 1,
0], {j, 2, Length[sigfft] - 1}];
count1 = Count[thresh1, 1]
thresh2 =
Table[If[sigfft[[j]] > 3/4*(sigfft[[j - 1]] + sigfft[[j + 1]]), 1,
0], {j, 2, Length[sigfft] - 1}];
count2 = Count[thresh2, 1]
Out[420]= 114
Out[422]= 100
Now we get our best guess as to the value of "repeats", by taking the floor of the total length over the average of our counts.
approxrepeats = Floor[2*Length[signal]/(count1 + count2)]
Out[423]= 5
So we have found that the basic signal is repeated 5 times. That can give a start toward refining to estimate the correct length (baselen, above). To that end we might try removing elements at the end and seeing when we get ffts closer to actually having runs of four 0s between nonzero values.
Something else that might work for estimating number of repeats is finding the modal number of zeros in run length encoding of the thresholded ffts. While I have not actually tried that, it looks like it might be robust to bad choices in the details of how one does the thresholding (mine were just experiments that seem to work).
Daniel Lichtblau
The following assumes that the cycle starts on the first element and gives the period length and the cycle.
findCyclingList[a_?VectorQ] :=
 Module[{repeats1, repeats2, cl, cLs, vec},
  repeats1 = Flatten@Differences[Position[a, First[a]]];
  repeats2 = Flatten[Position[repeats1, First[repeats1]]];
  If[Equal @@ Differences[repeats2] && Length[repeats2] > 2(*
   is potentially cyclic - first element appears cyclically *),
   cl = Plus @@@ Partition[repeats1, First[Differences[repeats2]]];
   cLs = Partition[a, First[cl]];
   If[SameQ @@ cLs (* candidate cycles all actually the same *),
    vec = First[cLs];
    {Length[vec], vec}, $Failed], $Failed] ]
Testing
b = RandomInteger[50, {100}];
a = Join[b, b, b, b, Take[b, 27]];
findCyclingList[a]
{100, {47, 15, 42, 10, 14, 29, 12, 29, 11, 37, 6, 19, 14, 50, 4, 38,
23, 3, 41, 39, 41, 17, 32, 8, 18, 37, 5, 45, 38, 8, 39, 9, 26, 33,
40, 50, 0, 45, 1, 48, 32, 37, 15, 37, 49, 16, 27, 36, 11, 16, 4, 28,
31, 46, 30, 24, 30, 3, 32, 31, 31, 0, 32, 35, 47, 44, 7, 21, 1, 22,
43, 13, 44, 35, 29, 38, 31, 31, 17, 37, 49, 22, 15, 28, 21, 8, 31,
42, 26, 33, 1, 47, 26, 1, 37, 22, 40, 27, 27, 16}}
b1 = RandomInteger[10000, {100}];
a1 = Join[b1, b1, b1, b1, Take[b1, 23]];
findCyclingList[a1]
{100, {1281, 5325, 8435, 7505, 1355, 857, 2597, 8807, 1095, 4203,
3718, 3501, 7054, 4620, 6359, 1624, 6115, 8567, 4030, 5029, 6515,
5921, 4875, 2677, 6776, 2468, 7983, 4750, 7609, 9471, 1328, 7830,
2241, 4859, 9289, 6294, 7259, 4693, 7188, 2038, 3994, 1907, 2389,
6622, 4758, 3171, 1746, 2254, 556, 3010, 1814, 4782, 3849, 6695,
4316, 1548, 3824, 5094, 8161, 8423, 8765, 1134, 7442, 8218, 5429,
7255, 4131, 9474, 6016, 2438, 403, 6783, 4217, 7452, 2418, 9744,
6405, 8757, 9666, 4035, 7833, 2657, 7432, 3066, 9081, 9523, 3284,
3661, 1947, 3619, 2550, 4950, 1537, 2772, 5432, 6517, 6142, 9774,
1289, 6352}}
This case should fail because it isn't cyclical.
findCyclingList[Join[b, Take[b, 11], b]]
$Failed
I tried to do something with Repeated, e.g. a /. Repeated[t__, {2, 100}] -> {t}, but it just doesn't work for me.
Does this work for you?
period[a_] :=
 Quiet[Check[
   First[Cases[
     Table[
      {k, Equal @@ Partition[a, k]},
      {k, Floor[Length[a]/2]}],
     {k_, True} :> k
     ]],
   $Failed]]
Strictly speaking, this will fail for things like
a = {1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5}
although this can be fixed by using something like:
(Equal @@ Partition[a, k]) && (Equal @@ Partition[Reverse[a], k])
(probably computing Reverse[a] just once ahead of time.)
I propose this. It borrows from both Verbeia and Brett's answers.
Do[
  If[MatchQ @@ Equal @@ Partition[#, i, i, 1, _], Return @@ i],
  {i, #[[ 2 ;; Floor[Length@#/2] ]] ~Position~ First@#}
] /. Null -> $Failed &
It is not quite as efficient as Verbeia's function on long periods, but it is faster on short ones, and it is simpler as well.
I don't know how to solve it in mathematica, but the following algorithm (written in python) should work. It's O(n) so speed should be no concern.
def period(array):
    if len(array) == 0:
        return False
    else:
        s = array[0]
        match = False
        end = 0
        i = 0
        for k in range(1,len(array)):
            c = array[k]
            if not match:
                if c == s:
                    i = 1
                    match = True
                    end = k
            else:
                if not c == array[i]:
                    match = False
                i += 1
        if match:
            return array[:end]
        else:
            return False
# False
print(period([4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2,1]))
# [4, 5, 1, 2, 3]
print(period([4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2]))
# False
print(period([4]))
# [4, 2]
print(period([4,2,4]))
# False
print(period([4,2,1]))
# False
print(period([]))
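A different linear-time route, not the method of the answer above, is the prefix function (failure table) from the Knuth-Morris-Pratt string search: the smallest period of a sequence is its length minus the length of its longest proper border. The sketch below is a minimal illustration; the name smallest_period and the rule that the pattern must fit at least twice are assumptions chosen to match the cap of half the list length used elsewhere in this thread.

```python
def smallest_period(seq):
    """Return the smallest period p with 2 * p <= len(seq), or None if there is none."""
    n = len(seq)
    if n == 0:
        return None
    # fail[i] = length of the longest proper prefix of seq[:i+1] that is also its suffix.
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and seq[i] != seq[k]:
            k = fail[k - 1]
        if seq[i] == seq[k]:
            k += 1
        fail[i] = k
    p = n - fail[-1]  # smallest shift that maps the sequence onto itself
    return p if 2 * p <= n else None


print(smallest_period([4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2]))  # 5
print(smallest_period([1, 2, 3, 1, 2, 3, 1, 9, 6]))                 # None
```

Unlike the function above, this sketch returns None for [4, 2, 4], because a period of 2 does not fit twice into three elements.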
Ok, just to show my own work here:
ModifiedTortoiseHare[a_List] := Module[{counter, tortoise, hare},
  Quiet[
   Check[
    counter = 1;
    tortoise = a[[counter]];
    hare = a[[2 counter]];
    While[(tortoise != hare) || (a[[counter ;; 2 counter - 1]] != a[[2 counter ;; 3 counter - 1]]),
     counter++;
     tortoise = a[[counter]];
     hare = a[[2 counter]];
     ];
    counter,
    $Failed]]]
I'm not sure this is 100% correct, especially with cases like {pattern, pattern, different, pattern, pattern}, and it gets slower and slower when there are a lot of repeating elements, like so:
{ 1,2,1,1, 1,2,1,1, 1,2,1,1, ...}
because it is making too many expensive comparisons.
#include <iostream>
#include <vector>
using namespace std;
int period(vector<int> v)
{
    int p=0; // period 0
    for(int i=p+1; i<v.size(); i++)
    {
        if(v[i] == v[0])
        {
            p=i; // new potential period
            bool periodical=true;
            for(int i=0; i<v.size()-p; i++)
            {
                if(v[i]!=v[i+p])
                {
                    periodical=false;
                    break;
                }
            }
            if(periodical) return p;
            i=p; // try to find new period
        }
    }
    return 0; // no period
}
int main()
{
    vector<int> v3{1,2,3,1,2,3,1,2,3};
    cout<<"Period is :\t"<<period(v3)<<endl;
    vector<int> v0{1,2,3,1,2,3,1,9,6};
    cout<<"Period is :\t"<<period(v0)<<endl;
    vector<int> v1{1,2,1,1,7,1,2,1,1,7,1,2,1,1};
    cout<<"Period is :\t"<<period(v1)<<endl;
    return 0;
}
This sounds like it might relate to sequence alignment. These algorithms are well studied, and might already be implemented in Mathematica.

Resources