XOR of numbers = X - algorithm

I found this problem in a hiring contest (which is over now). Here it is:
You are given two natural numbers N and X. You are required to create an array of N natural numbers such that the bitwise XOR of these numbers is equal to X. The sum of all the numbers in the array must be as small as possible.
If there exist multiple arrays, print the smallest one.
Array A < Array B if
A[i] < B[i] for some index i, and A[j] = B[j] for all indices j less than i
Sample Input: N=3, X=2
Sample output : 1 1 2
Explanation: We have to print 3 natural numbers having the minimum sum. Thus the N space-separated numbers are 1 1 2.
My approach:
If N is odd, I put N-1 ones in the array (so that their xor is zero) and then put X
If N is even, I put N-1 ones again and then put X-1(if X is odd) and X+1(if X is even)
But this algorithm failed for most of the test cases. For example, when N=4 and X=6 my output is
1 1 1 7 but it should be 1 1 2 4
Does anyone know how to make the array sum minimal?

In order to have the minimum sum, you need to make sure that, when your target is X, you are not cancelling bits of X and then recreating them, because this will increase the sum. For this, you have to create the bits of X one by one (ideally) from the end of the array. So, as in your example of N=4 and X=6 we have (I use ^ to show xor):
X = 6 = 110 (binary) = 2 + 4. Note that 2^4 = 6 as well because these numbers don't share any common bits. So, the output is 1 1 2 4.
So, we start by creating the most significant bits of X from the end of the output array. Then, we also have to handle the corner cases for different values of N. I'll go through a number of different examples to make the idea clear:
``
A) X=14, N=5:
X=1110=8+4+2. So, the array is 1 1 2 4 8.
B) X=14, N=6:
X=8+4+2. The array should be 1 1 1 1 2 12.
C) X=15, N=6:
X=8+4+2+1. The array should be 1 1 1 2 4 8.
D) X=15, N=5:
The array should be 1 1 1 2 12.
E) X=14, N=2:
The array should be 2 12. Because 12 = 4^8
``
So, we go as follows. We compute the number of powers of 2 in X (its set bits). Let this number be k.
Case 1 - If k >= N (example E): we start by picking the smallest powers from left to right and merge the remaining ones into the last position in the array.
Case 2 - If k < N (examples A, B, C, D): we compute h = N - k. If h is odd we set h = N-k+1. Now, we start by putting h 1's in the beginning of the array. If h was increased, the number of places left is less than k, so we can follow the idea of Case 1 for the remaining positions. Note that in Case 2, instead of having an odd number of added 1's we put an even number of 1's and then do some merging at the end. This guarantees that the array is the smallest it can be.
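The two cases can be sketched in Python as follows. This is only a sketch of the main path, not a verified contest solution: the name `min_sum_xor_array` is hypothetical, and the corner cases the answer mentions (for example X being a single power of two with N even, where padding with 1's leaves no slot for the bits of X) are rejected rather than solved.

```python
def min_sum_xor_array(n, x):
    """Build the array described by Case 1 / Case 2 for array length n and target x."""
    if n < 1 or x < 1:
        raise ValueError("n and x must be natural numbers")
    # Powers of two present in x, smallest first.
    bits = [1 << i for i in range(x.bit_length()) if (x >> i) & 1]
    k = len(bits)
    if k >= n:
        # Case 1: one bit per slot, the remaining bits merged into the last slot.
        return bits[:n - 1] + [sum(bits[n - 1:])]
    # Case 2: pad with an even number of 1's so that they cancel out.
    h = n - k
    if h % 2 == 1:
        h += 1
    slots = n - h
    if slots < 1:
        # Corner case not covered by the two cases above.
        raise NotImplementedError("corner case: no slot left for the bits of x")
    return [1] * h + bits[:slots - 1] + [sum(bits[slots - 1:])]


print(min_sum_xor_array(4, 6))   # the example from the question
print(min_sum_xor_array(6, 14))  # example B
```

The padding 1's always come in pairs, so they contribute nothing to the XOR and only the remaining slots have to reproduce X.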

We have to minimize the sum of the array, and that is the key point.
First count the set bits of X. If the number of set bits is greater than or equal to N, split X into N integers along its set bits, like
X = 15, N = 2
set bits in 15 are 4, solution is 1 14
if N = 3 solution is 1 2 12
this minimizes the array sum too.
In the other case, the number of set bits is less than N:
calculate difference = N - setbits(X)
If difference is even then add that many ones and apply the above algorithm; all the ones will cancel out.
If difference is odd then add ones, but now you have to take care of the 1 extra one in the answer array.
Check for the corner cases too.

Related

Find minimum no of swaps required to move all 1's together in a binary array

Eg: Array : [0,1,0,1,1,0,0]
Final Array: [0,0,1,1,1,0,0] , So swaps required = 1
I need an O(n) or O(n log n) solution.
You can do it in O(n):
In one pass through the data, determine the number of 1s. Call this k (it is just the sum of the elements in the list).
In a second pass through the data, use a sliding window of width k to find the number, m which is the maximum number of 1s in any window of size k. Since this is homework, I'll leave the details to you, but it can be done in O(n).
Then: the minimal number of swaps is k-m.
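A minimal sketch of the sliding-window idea, assuming any two positions may be swapped (not only neighbours). The function name `min_swaps_any_distance` is made up for illustration.

```python
def min_swaps_any_distance(bits):
    """Minimum swaps to group all 1s together when any two cells may be swapped."""
    k = sum(bits)                    # first pass: number of 1s
    if k == 0:
        return 0
    window = sum(bits[:k])           # 1s inside the first window of width k
    m = window
    for right in range(k, len(bits)):
        # Slide the window one step: add the new cell, drop the oldest one.
        window += bits[right] - bits[right - k]
        m = max(m, window)
    return k - m                     # every 0 left in the best window costs one swap


print(min_swaps_any_distance([0, 1, 0, 1, 1, 0, 0]))
```

Each 0 inside the best window is exchanged with one 1 outside it, which is why the answer is k - m.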
EDIT This answer assumes that only two neighboring cells can be swapped. If the distance between the two swapped elements is arbitrary, see @JohnColeman's answer.
This can be done easily in linear time.
Suppose that the array is called a and its size is n.
Allocate integer array b of size n. Walk from left to right, save in b[i] the number of ones seen so far in a[0], ..., a[i].
Allocate integer array c of size n. Walk from right to left, save in c[i] the number of ones seen so far in a[i], ..., a[n - 1].
Initialize integer res = 0. Walk through a one last time. For each i with a[i] = 0, add res += min(b[i], c[i])
Output res
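The steps above can be sketched in Python like this (the function name `min_adjacent_swaps` is hypothetical; the arrays b and c are the ones from the description):

```python
def min_adjacent_swaps(a):
    """Sum of min(b[i], c[i]) over all zeros, with b/c the prefix/suffix counts of ones."""
    n = len(a)
    b = [0] * n
    seen = 0
    for i in range(n):                 # ones in a[0..i]
        seen += a[i]
        b[i] = seen
    c = [0] * n
    seen = 0
    for i in range(n - 1, -1, -1):     # ones in a[i..n-1]
        seen += a[i]
        c[i] = seen
    return sum(min(b[i], c[i]) for i in range(n) if a[i] == 0)


print(min_adjacent_swaps([0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0]))
```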
Why does this work? Each zero must somehow bubble out of the block of ones. So, every zero must either "bubble-up" past all ones to the right of it, or it must "bubble-down" past all ones to the left of it. Swapping zeros with zeros is a waste of time, therefore the process of zero-eviction from the homogeneous block of ones must start with those zeros that are as close to the first 1 or the last 1 as possible. This means that every zero will have to make exactly min(b[i], c[i]) swaps with 1s to exit the homogeneous block of ones.
Example:
a = [0,1,0,1,1,0,1,0,1,0,1,0]
b = [0,1,1,2,3,3,4,4,5,5,6,6]
c = [6,6,5,5,4,3,3,2,2,1,1,0]
now, min(b,c) would be (no need to compute it explicitly):
m = [0,1,1,2,3,3,3,2,2,1,1,0]
     ^   ^     ^   ^   ^   ^
The interesting values of min(b[i], c[i]) which correspond to 0s are marked with ^. Summing it up yields: 0 + 1 + 3 + 2 + 1 + 0 = 7.
Indeed:
[0,1,0,1,1,0,1,0,1,0,1,0]
[0,0,1,1,1,0,1,0,1,0,1,0] 1
[0,0,1,1,1,0,1,0,1,1,0,0] 2 = 1 + 1
[0,0,1,1,1,0,1,1,0,1,0,0] 3
[0,0,1,1,1,0,1,1,1,0,0,0] 4 = 1 + 1 + 2
[0,0,1,1,0,1,1,1,1,0,0,0] 5
[0,0,1,0,1,1,1,1,1,0,0,0] 6
[0,0,0,1,1,1,1,1,1,0,0,0] 7 = 1 + 1 + 2 + 3
done: block of ones homogeneous.
Runtime for computation of the number res of swaps is obviously O(n). (Note: it does NOT say that the number of swaps is itself O(n)).
Let's consider each 1 as a potential static point. Then the cost for the left side of the static point would be the number of 1's to the left subtracted by the number of 1's already in the section it would naturally extend to, the length of which is the number of 1's on the left. Similarly for the right side.
Now find a way to do it efficiently for each potential static 1 :) Hint: think about how we could update those values as we iterate across the array.
1 0 1 0 1 1 0 0 1 0 1 1
          x              potential static point
    <-----               would extend to
           ----->        would extend to
left cost at x: 3 - 2 = 1
right cost at x: 3 - 1 = 2

Find the highest minimum difference of a number from a given range

Say I have an array of N numbers {A1, A2, ... , An} and 2 numbers P, Q
I have to find an integer M between P and Q , such that, min {|Ai-M|, 1 ≤ i ≤ N} is maximised.
1 < N < P ≤ Q ≤ 10^6
in simpler words:
for each number, find the minimum absolute difference between this number and the array.
then out of all those minimum differences, find the number who has the highest minimum difference.
I have to do this in O(NlogN) or less.
I have tried the following:
sort the array A (NlogN)
iterate over all the numbers between P and Q and for each number find the minimum difference using modified binary search and keep track who has the highest difference - O((Q-P)logN)
I'm guessing there is some kind of math "trick" like using average I'm missing..
edit (add example):
for example if you have the array
{5 8 14}
and P=4 Q=9
the answer is 4,6,7, or 9.
lets look at the numbers 4-9
|4-5| = 1
|4-8| = 4
|4-14| = 10
so minimum diff for 4 is 1
|5-5| = 0
|5-8| = 3
|5-14| = 9
so minimum diff for 5 is 0
we keep going and find minimum diff for all the numbers and then we need to say which number (4/5/6/7/8/9) had the highest minimum diff (in this example 4,6,7 and 9 have 1 minimum difference which is max among all minimum differences)
First you have to sort the array. Then notice that the solution is either P, or Q, or some point x[i] = (A[i] + A[i+1]) // 2, i.e. the midpoint between two consecutive elements of the sorted array (provided this x[i] lies between P and Q).
After sorting there are only O(N) such candidates, and each one can be scored in O(log N) with a binary search, so the search fits within the required O(N log N).
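A small sketch of that candidate search, using the standard `bisect` module. The function name `best_point` and the tie-breaking rule (return the smallest M among equally good ones) are choices made for this example only.

```python
from bisect import bisect_left


def best_point(a, p, q):
    """Return (M, d): an integer M in [p, q] maximising d = min |A_i - M|."""
    a = sorted(a)

    def min_diff(m):
        # Distance from m to its nearest array element, via binary search.
        pos = bisect_left(a, m)
        best = float("inf")
        if pos < len(a):
            best = a[pos] - m
        if pos > 0:
            best = min(best, m - a[pos - 1])
        return best

    candidates = {p, q}
    for left, right in zip(a, a[1:]):
        mid = (left + right) // 2      # midpoint of a gap between neighbours
        if p <= mid <= q:
            candidates.add(mid)
    m = max(candidates, key=lambda v: (min_diff(v), -v))
    return m, min_diff(m)


print(best_point([5, 8, 14], 4, 9))
```

For the example from the question this picks 4, one of the four equally good answers (4, 6, 7, 9).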

In how many ways can you construct an array of size N such that the product of any pair of consecutive elements in not greater than M?

Every element is an integer and should have a value of at least 1.
Constraints: 2 ≤ N ≤ 1000 and 1 ≤ M ≤ 1000000000.
We need to find the answer modulo 1000000007
Maybe we can calculate dp[len][type][typeValue], where type has only two states:
type = 0: this means that the last number in the sequence of length len is less than or equal to sqrt(M). We save this number in typeValue.
type = 1: this means that the last number in the sequence is bigger than sqrt(M). We save in typeValue the number k = M / lastNumber (rounded down), which is not greater than sqrt(M).
So, this dp has O(N sqrt(M)) states, but how can we calculate each 'cell' of this dp?
Firstly, consider some 'cell' dp[len][0][number]. This value can be calculated as follows:
dp[len][0][number] = sum[1 <= i <= sqrt(M)] (dp[len - 1][0][i]) + sum[number <= i <= sqrt(M)] (dp[len - 1][1][i])
A little explanation: because type = 0 means number <= sqrt(M), next to it we can put any number not greater than sqrt(M), but only some of the numbers greater than sqrt(M).
For dp[len][1][k] we can use the following equation:
dp[len][1][k] = sum[1 <= i <= k] (dp[len - 1][0][i] * cntInGroup(k)) where cntInGroup(k) is the count of numbers x such that M / x = k
We can simply calculate cntInGroup(k) for all 1 <= k <= sqrt(M) using binary search or formulas.
But another problem is that our algorithm needs O(sqrt(M)) operations per cell, so the resulting asymptotic is O(N M). But we can improve that.
Note that we need to calculate sums of some values on segments, which were processed on the previous step. So, we can precalculate prefix sums in advance, and after that we can calculate each 'cell' of dp in O(1) time.
So, with this optimization we can solve this problem with asymptotic O(N sqrt(M)).
Here is an example for N = 4, M = 10:
1 number divides 10 into 10 equal parts with a remainder less than the part
1 number divides 10 into 5 equal parts with a remainder less than the part
1 number divides 10 into 3 equal parts with a remainder less than the part
2 numbers divide 10 into 2 equal parts with a remainder less than the part
5 numbers divide 10 into 1 part with a remainder less than the part
Make an array and update it for each value of n:
N 1 1 1 2 5
----------------------
2 10 5 3 2 1 // 10 div 1 ; 10 div 2 ; 10 div 3 ; 10 div 5,4 ; 10 div 6,7,8,9,10
3 27 22 18 15 10 // 10+5+3+2*2+5*1 ; 10+5+3+2*2 ; 10+5+3 ; 10+5 ; 10
4 147 97 67 49 27 // 27+22+18+2*15+5*10 ; 27+22+18+2*15 ; 27+22+18 ; 27+22 ; 27
The solution for N = 4, M = 10 is therefore:
147 + 97 + 67 + 2*49 + 5*27 = 544
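The table above can be reproduced with a short Python sketch. It groups the values 1..M into blocks that share the same quotient M div x (the columns of the table) and updates one row per array position with prefix sums. The function name `count_arrays` is hypothetical, and the code is an illustration of the table-based idea rather than a tuned contest solution.

```python
MOD = 1_000_000_007


def count_arrays(n, m):
    """Count arrays of length n with values >= 1 and adjacent products <= m (mod MOD)."""
    # Blocks of consecutive x with the same quotient m // x, stored by their size.
    sizes = []
    lo = 1
    while lo <= m:
        hi = m // (m // lo)            # last x sharing the quotient of lo
        sizes.append(hi - lo + 1)
        lo = hi + 1
    blocks = len(sizes)
    # ways[j]: arrays of the current length ending in one fixed value of block j.
    ways = [1] * blocks
    for _ in range(n - 1):
        prefix = []
        running = 0
        for size, w in zip(sizes, ways):
            running = (running + size * w) % MOD
            prefix.append(running)
        # A value in block j has quotient equal to the upper end of block
        # blocks-1-j, so its predecessors are exactly blocks 0..blocks-1-j.
        ways = [prefix[blocks - 1 - j] for j in range(blocks)]
    return sum(size * w for size, w in zip(sizes, ways)) % MOD


print(count_arrays(4, 10))
```

For M = 10 the block sizes are 1 1 1 2 5, matching the header row of the table, and the number of blocks is O(sqrt(M)), which gives O(N sqrt(M)) overall.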
My thought process:
For each number in the first array position, respectively, there could be the
following in the second:
1 -> 1,2..10
2 -> 1,2..5
3 -> 1,2,3
4 -> 1,2
5 -> 1,2
6 -> 1
7 -> 1
8 -> 1
9 -> 1
10 -> 1
Array position 3:
For each of 10 1's in col 2, there could be 1 of 1,2..10
For each of 5 2's in col 2, there could be 1 of 1,2..5
For each of 3 3's in col 2, there could be 1 of 1,2,3
For each of 2 4's in col 2, there could be 1 of 1,2
For each of 2 5's in col 2, there could be 1 of 1,2
For each of 1 6,7..10 in col 2, there could be one 1
27 1's; 22 2's; 18 3's; 15 4's; 15 5's; 10 x 6's,7's,8's,9's,10's
Array position 4:
1's = 27+22+18+15+15+10*5
2's = 27+22+18+15+15
3's = 27+22+18
4's = 27+22
5's = 27+22
6,7..10's = 27 each
Create a graph and assign the values from 0 to M to the vertices. An edge exists between two vertices if their product is not greater than M. The number of different arrays is then the number of paths with N steps, starting at the vertex with value 0. This number can be computed using a simple depth-first search.
The question is now whether this is efficient enough and whether it can be made more efficient. One way is to restructure the solution using matrix multiplication. The matrix to multiply with represents the edges above, it has a 1 when there is an edge, a 0 otherwise. The initial matrix on the left represents the starting vertex, it has a 1 at position (0, 0), zeros everywhere else.
Based on this, you can multiply the right matrix with itself to represent two steps through the graph. This means that you can combine two steps to make them more efficient, so you only need to multiply log(N) times, not N times. However, make sure you use known efficient matrix multiplication algorithms to implement this, the naive one will only perform for small M.

Levenshtein distance from particular group of numbers

My input is three numbers: a number s and the beginning b and end e of a range, with 0 <= s,b,e <= 10^1000. The task is to find the minimal Levenshtein distance between s and all numbers in the range [b, e]. It is not necessary to find the number minimizing the distance; the minimal distance is sufficient.
Obviously I have to read the numbers as strings, because no standard C++ type can handle such large numbers. Calculating the Levenshtein distance for every number in the possibly huge range is not feasible.
Any ideas?
[EDIT 10/8/2013: Some cases considered in the DP algorithm actually don't need to be considered after all, though considering them does not lead to incorrectness :)]
In the following I describe an algorithm that takes O(N^2) time, where N is the largest number of digits in any of b, e, or s. Since all these numbers are limited to 1000 digits, this means at most a few million basic operations, which will take milliseconds on any modern CPU.
Suppose s has n digits. In the following, "between" means "inclusive"; I will say "strictly between" if I mean "excluding its endpoints". Indices are 1-based. x[i] means the ith digit of x, so e.g. x[1] is its first digit.
Splitting up the problem
The first thing to do is to break up the problem into a series of subproblems in which each b and e have the same number of digits. Suppose e has k >= 0 more digits than b: break up the problem into k+1 subproblems. E.g. if b = 5 and e = 14032, create the following subproblems:
b = 5, e = 9
b = 10, e = 99
b = 100, e = 999
b = 1000, e = 9999
b = 10000, e = 14032
We can solve each of these subproblems, and take the minimum solution.
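The splitting step can be sketched as follows, working on decimal strings because the numbers can have up to 1000 digits. The helper name `split_by_length` is an assumption for this example, and the inputs are assumed to have no leading zeros with b <= e.

```python
def split_by_length(b, e):
    """Split [b, e] into subranges whose endpoints have the same number of digits."""
    parts = []
    for digits in range(len(b), len(e) + 1):
        # Lower end: b itself for the first block, otherwise 10^(digits-1).
        lo = b if digits == len(b) else "1" + "0" * (digits - 1)
        # Upper end: e itself for the last block, otherwise 10^digits - 1.
        hi = e if digits == len(e) else "9" * digits
        parts.append((lo, hi))
    return parts


for lo, hi in split_by_length("5", "14032"):
    print("b = %s, e = %s" % (lo, hi))
```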
The easy cases: the middle
The easy cases are the ones in the middle. Whenever e has k >= 1 more digits than b, there will be k-1 subproblems (e.g. 3 above) in which b is a power of 10 and e is the next power of 10, minus 1. Suppose b is 10^m. Notice that choosing any digit between 1 and 9, followed by any m digits between 0 and 9, produces a number x that is in the range b <= x <= e. Furthermore there are no numbers in this range that cannot be produced this way. The minimum Levenshtein distance between s (or in fact any given length-n digit string that doesn't start with a 0) and any number x in the range 10^m <= x <= 10^(m+1)-1 is necessarily abs(m+1-n), since if m+1 >= n it's possible to simply choose the first n digits of x to be the same as those in s, and delete the remainder, and if m+1 < n then choose the first m+1 to be the same as those in s and insert the remainder.
In fact we can deal with all these subproblems in a single constant-time operation: if the smallest "easy" subproblem has b = 10^m and the largest "easy" subproblem has b = 10^u, then the numbers in these ranges have between m+1 and u+1 digits, so the minimum Levenshtein distance between s and any number in any of these ranges is m+1-n if n < m+1, n-(u+1) if n > u+1, and 0 otherwise.
The hard cases: the end(s)
The hard cases are when b and e are not restricted to have the form b = 10^m and e = 10^(m+1)-1 respectively. Any master problem can generate at most two subproblems like this: either two "ends" (resulting from a master problem in which b and e have different numbers of digits, such as the example at the top) or a single subproblem (i.e. the master problem itself, which didn't need to be subdivided at all because b and e already have the same number of digits). Note that due to the previous splitting of the problem, we can assume that the subproblem's b and e have the same number of digits, which we will call m.
Super-Levenshtein!
What we will do is design a variation of the Levenshtein DP matrix that calculates the minimum Levenshtein distance between a given digit string (s) and any number x in the range b <= x <= e. Despite this added "power", the algorithm will still run in O(n^2) time :)
First, observe that if b and e have the same number of digits and b != e, then it must be the case that they consist of some number q >= 0 of identical digits at the left, followed by a digit that is larger in e than in b. Now consider the following procedure for generating a random digit string x:
1. Set x to the first q digits of b.
2. Append a randomly-chosen digit d between b[i] and e[i] to x.
3. If d == b[i], we "hug" the lower bound:
   For i from q+1 to m:
      If b[i] == 9 then append b[i]. [EDIT 10/8/2013: Actually this can't happen, because we chose q so that e[i] will be larger than b[i], and there is no digit larger than 9!]
      Otherwise, flip a coin:
         Heads: Append b[i].
         Tails: Append a randomly-chosen digit d > b[i], then goto 6.
   Stop.
4. Else if d == e[i], we "hug" the upper bound:
   For i from q+1 to m:
      If e[i] == 0 then append e[i]. [EDIT 10/8/2013: Actually this can't happen, because we chose q so that b[i] will be smaller than e[i], and there is no digit smaller than 0!]
      Otherwise, flip a coin:
         Heads: Append e[i].
         Tails: Append a randomly-chosen digit d < e[i], then goto 6.
   Stop.
5. Otherwise (if d is strictly between b[i] and e[i]), drop through to step 6.
6. Keep appending randomly-chosen digits to x until it has m digits.
The basic idea is that after including all the digits that you must include, you can either "hug" the lower bound's digits for as long as you want, or "hug" the upper bound's digits for as long as you want, and as soon as you decide to stop "hugging", you can thereafter choose any digits you want. For suitable random choices, this procedure will generate all and only the numbers x such that b <= x <= e.
In the "usual" Levenshtein distance computation between two strings s and x, of lengths n and m respectively, we have a rectangular grid from (0, 0) to (n, m), and at each grid point (i, j) we record the Levenshtein distance between the prefix s[1..i] and the prefix x[1..j]. The score at (i, j) is calculated from the scores at (i-1, j), (i, j-1) and (i-1, j-1) using bottom-up dynamic programming. To adapt this to treat x as one of a set of possible strings (specifically, a digit string corresponding to a number between b and e) instead of a particular given string, what we need to do is record not one but two scores for each grid point: one for the case where we assume that the digit at position j was chosen to hug the lower bound, and one where we assume it was chosen to hug the upper bound. The 3rd possibility (step 5 above) doesn't actually require space in the DP matrix because we can work out the minimal Levenshtein distance for the entire rest of the input string immediately, very similar to the way we work it out for the "easy" subproblems in the first section.
Super-Levenshtein DP recursion
Call the overall minimal score at grid point (i, j) v(i, j). Let diff(a, b) = 1 if characters a and b are different, and 0 otherwise. Let inrange(a, b..c) be 1 if the character a is in the range b..c, and 0 otherwise. The calculations are:
# The best Lev distance overall between s[1..i] and x[1..j]
v(i, j) = min(hb(i, j), he(i, j))
# The best Lev distance between s[1..i] and x[1..j] obtainable by
# continuing to hug the lower bound
hb(i, j) = min(hb(i-1, j)+1, hb(i, j-1)+1, hb(i-1, j-1)+diff(s[i], b[j]))
# The best Lev distance between s[1..i] and x[1..j] obtainable by
# continuing to hug the upper bound
he(i, j) = min(he(i-1, j)+1, he(i, j-1)+1, he(i-1, j-1)+diff(s[i], e[j]))
At the point in time when v(i, j) is being calculated, we will also calculate the Levenshtein distance resulting from choosing to "stop hugging", i.e. by choosing a digit that is strictly in between b[j] and e[j] (if j == q) or (if j != q) is either above b[j] or below e[j], and thereafter freely choosing digits to make the suffix of x match the suffix of s as closely as possible:
# The best Lev distance possible between the ENTIRE STRINGS s and x, given that
# we choose to stop hugging at the jth digit of x, and have optimally aligned
# the first i digits of s to these j digits
sh(i, j) = if j >= q then shc(i, j)+abs(n-i-m+j)
           else infinity
# A substitution costs 0 when s[i] is one of the digits we are free to choose
# for x[j], hence 1-inrange(...)
shc(i, j) = if j == q then
              min(hb(i, j-1)+1, hb(i-1, j-1)+1-inrange(s[i], (b[j]+1)..(e[j]-1)))
            else
              min(hb(i, j-1)+1, hb(i-1, j-1)+1-inrange(s[i], (b[j]+1)..9),
                  he(i, j-1)+1, he(i-1, j-1)+1-inrange(s[i], 0..(e[j]-1)))
The formula for shc(i, j) doesn't need to consider "downward" moves, since such moves don't involve any digit choice for x.
The overall minimal Levenshtein distance is the minimum of v(n, m) and sh(i, j), for all 0 <= i <= n and 0 <= j <= m.
Complexity
Take N to be the largest number of digits in any of s, b or e. The original problem can be split in linear time into at most 1 set of easy problems that collectively takes O(1) time to solve and 2 hard subproblems that each take O(N^2) time to solve using the super-Levenshtein algorithm, so overall the problem can be solved in O(N^2) time, i.e. time proportional to the square of the number of digits.
A first idea to speed up the computation (works if |e-b| is not too large):
Question: how much can the Levenshtein distance change when we compare s with n and then with n+1?
Answer: not too much!
Let's see the dynamic-programming tables for s = 12007 and two consecutive n
n = 12296
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 1 2 3
4 3 2 2 2 3
5 4 3 3 3 3
and
n = 12297
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 1 2 3
4 3 2 2 2 3
5 4 3 3 3 2
As you can see, only the last column changes, since n and n+1 have the same digits, except for the last one.
If you have the dynamic-programming table for the edit-distance of s = 12007 and n = 12296, you already have the table for n = 12297, you just need to update the last column!
Obviously if n = 12299 then n+1 = 12300 and you need to update the last 3 columns of the previous table.. but this happens just once every 100 iterations.
In general, you have to
update the last column on every iteration (so, length(s) cells)
update the second-to-last too, once every 10 iterations
update the third-to-last, too, once every 100 iterations
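A sketch of this column-reuse idea in Python, assuming the range is small enough to enumerate and that b and e fit the loop as ordinary integers. The names `next_column` and `min_distance_over_range` are made up for the example. One table column is kept per digit of the current number, and only the columns after the common prefix with the previous number are recomputed.

```python
def next_column(prev_col, s, digit):
    """Extend the edit-distance table by one column for one more digit of n."""
    col = [prev_col[0] + 1]
    for i in range(1, len(s) + 1):
        col.append(min(prev_col[i] + 1,                          # skip the new digit
                       col[i - 1] + 1,                           # skip s[i-1]
                       prev_col[i - 1] + (s[i - 1] != digit)))   # match / substitute
    return col


def min_distance_over_range(s, b, e):
    """Minimum Levenshtein distance between string s and any integer in [b, e]."""
    cols = [list(range(len(s) + 1))]   # column for the empty prefix of n
    prev = ""
    best = None
    for value in range(b, e + 1):
        x = str(value)
        common = 0
        while common < min(len(prev), len(x)) and prev[common] == x[common]:
            common += 1
        del cols[common + 1:]          # keep only columns of the shared prefix
        for digit in x[common:]:
            cols.append(next_column(cols[-1], s, digit))
        distance = cols[-1][-1]
        if best is None or distance < best:
            best = distance
        prev = x
    return best


print(min_distance_over_range("12007", 12296, 12297))
```

Going from 12296 to 12297 recomputes a single column, while going from 12299 to 12300 recomputes three, exactly as in the list above.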
so let L = length(s) and D = e-b. First you compute the edit-distance between s and b. Then you can find the minimum Levenshtein distance over [b,e] looping over every integer in the interval. There are D of them, so the execution time is about:
L * D * (1 + 1/10 + 1/100 + ...)
Now since
1 + 1/10 + 1/100 + ... = 10/9
we have an algorithm which is
O(L * D)

Algorithm puzzle interview

I found this interview question, and I couldn't come up with an algorithm better than O(N^2 * P):
Given a vector of P natural numbers (1,2,3,...,P) and another vector of length N whose elements are from the first vector, find the longest subsequence in the second vector, such that all elements are uniformly distributed (have the same frequency).
Example : (1,2,3) and (1,2,1,3,2,1,3,1,2,3,1). The longest subsequence is in the interval [2,10], because it contains all the elements from the first sequence with the same frequency (1 appears three times, 2 three times, and 3 three times).
The time complexity should be O(N * P).
"Subsequence" usually means noncontiguous. I'm going to assume that you meant "sublist".
Here's an O(N P) algorithm assuming we can hash (assumption not needed; we can radix sort instead). Scan the array keeping a running total for each number. For your example,
1 2 3
--------
0 0 0
1
1 0 0
2
1 1 0
1
2 1 0
3
2 1 1
2
2 2 1
1
3 2 1
3
3 2 2
1
4 2 2
2
4 3 2
3
4 3 3
1
5 3 3
Now, normalize each row by subtracting the minimum element. The result is
0: 000
1: 100
2: 110
3: 210
4: 100
5: 110
6: 210
7: 100
8: 200
9: 210
10: 100
11: 200.
Prepare two hashes, mapping each row to the first index at which it appears and the last index at which it appears. Iterate through the keys and take the one with maximum last - first.
000: first is at 0, last is at 0
100: first is at 1, last is at 10
110: first is at 2, last is at 5
210: first is at 3, last is at 9
200: first is at 8, last is at 11
The best key is 100, since its sublist has length 9. The sublist is the (1+1)th element to the 10th.
This works because a sublist is balanced if and only if its first and last unnormalized histograms are the same up to adding a constant, which occurs if and only if the first and last normalized histograms are identical.
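A compact Python sketch of the normalized-histogram approach. The name `longest_balanced_sublist` is hypothetical, and a dict keyed by tuples stands in for the hashes. Only the first index of each normalized row is stored, because the current position always plays the role of the last index.

```python
def longest_balanced_sublist(values, alphabet):
    """Return (i, j) so that values[i:j] is the longest sublist with equal frequencies."""
    index = {v: k for k, v in enumerate(alphabet)}
    counts = [0] * len(alphabet)
    first = {tuple(counts): 0}         # normalized row -> first prefix length seen
    best = (0, 0)
    for pos, v in enumerate(values, start=1):
        counts[index[v]] += 1
        low = min(counts)
        key = tuple(c - low for c in counts)   # subtract the minimum element
        start = first.setdefault(key, pos)
        if pos - start > best[1] - best[0]:
            best = (start, pos)
    return best


vec2 = [1, 2, 1, 3, 2, 1, 3, 1, 2, 3, 1]
i, j = longest_balanced_sublist(vec2, [1, 2, 3])
print(i, j, vec2[i:j])
```

Each step costs O(P) for the minimum and the tuple, so the whole scan is O(N P).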
If the memory usage is not important, it's easy...
You can build a matrix of dimensions N*P and save in column (i), for each of the P values, how many times it occurs among the first (i) elements of the second vector...
After completing the matrix, you can search for a column i in which all of the elements are equal. The maximum such i is the answer.
With randomization, you can get it down to linear time. The idea is to replace each of the P values with a random integer, such that those integers sum to zero. Now look for two prefix sums that are equal. This allows some small chance of false positives, which we could remedy by checking our output.
In Python 2.7:
# input:
vec1 = [1, 2, 3]
P = len(vec1)
vec2 = [1, 2, 1, 3, 2, 1, 3, 1, 2, 3, 1]
N = len(vec2)
# Choose big enough integer B. For each k in vec1, choose
# a random mod-B remainder r[k], so their mod-B sum is 0.
# Any P-1 of these remainders are independent.
import random
B = N*N*N
r = dict((k, random.randint(0,B-1)) for k in vec1)
s = sum(r.values())%B
r[vec1[0]] = (r[vec1[0]]+B-s)%B
assert sum(r.values())%B == 0
# For 0<=i<=N, let vec3[i] be mod-B sum of r[vec2[j]], for j<i.
vec3 = [0] * (N+1)
for i in range(1,N+1):
    vec3[i] = (vec3[i-1] + r[vec2[i-1]]) % B
# Find pair (i,j) so vec3[i]==vec3[j], and j-i is as large as possible.
# This is either a solution (subsequence vec2[i:j] is uniform) or a false
# positive. The expected number of false positives is < N*N/(2*B) < 1/N.
(i, j)=(0, 0)
first = {}
for k in range(N+1):
    v = vec3[k]
    if v in first:
        if k-first[v] > j-i:
            (i, j) = (first[v], k)
    else:
        first[v] = k
# output:
print("Found subsequence from %d (inclusive) to %d (exclusive):" % (i, j))
print(vec2[i:j])
print("This is either uniform, or rarely, it is a false positive.")
Here is an observation: you can't get a uniformly distributed sequence whose length is not a multiple of P. This implies that you only have to check the sub-sequences of the second vector that are P, 2P, 3P... long, which is about N^2/P such sequences.
You can get this down to O(N) time, with no dependence on P by enhancing uty's solution.
For each row, instead of storing the normalized counts of each element, store a hash of the normalized counts while only keeping the normalized counts for the current index. During each iteration, you need to first update the normalized counts, which has an amortized cost of O(1) if each decrement of a count is paid for when it is incremented. Next you recompute the hash. The key here is that the hash needs to be easily updatable following an increment or decrement of one of the elements of the tuple that is being hashed.
At least one way of doing this hashing efficiently, with good theoretical independence guarantees is shown in the answer to this question. Note that the O(lg P) cost for computing the exponential to determine the amount to add to the hash can be eliminated by precomputing the exponentials modulo the prime, with a total running time of O(P) for the precomputation, giving a total running time of O(N + P) = O(N).
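The linked answer is not reproduced here, so the sketch below swaps in a plain polynomial hash as one possible updatable hash. The function name, `base` and `prime` are assumptions of this example, not the hash from that answer. Incrementing one count adds a precomputed power to the hash, and when the minimum count rises, subtracting 1 from every normalized count is a single subtraction of the sum of all powers. As with any hashing scheme, a collision could produce a false positive, so a careful implementation would verify the reported sublist.

```python
def longest_balanced_sublist_hashed(values, alphabet,
                                    base=1_000_003, prime=(1 << 61) - 1):
    """Like the histogram scan, but keeps only an O(1)-updatable hash per prefix."""
    index = {v: k for k, v in enumerate(alphabet)}
    p = len(alphabet)
    powers = [1] * p                   # base^k mod prime, precomputed in O(P)
    for k in range(1, p):
        powers[k] = powers[k - 1] * base % prime
    all_powers = sum(powers) % prime   # hash change when every count drops by 1
    raw = [0] * p                      # unnormalized counts
    floor = 0                          # current minimum of raw
    at_level = {0: p}                  # how many symbols have each raw count
    h = 0                              # hash of the normalized counts raw[k] - floor
    first = {0: 0}
    best = (0, 0)
    for pos, v in enumerate(values, start=1):
        k = index[v]
        c = raw[k]
        raw[k] = c + 1
        at_level[c] -= 1
        at_level[c + 1] = at_level.get(c + 1, 0) + 1
        h = (h + powers[k]) % prime
        if at_level[floor] == 0:       # the minimum just rose by one
            floor += 1
            h = (h - all_powers) % prime
        start = first.setdefault(h, pos)
        if pos - start > best[1] - best[0]:
            best = (start, pos)
    return best


print(longest_balanced_sublist_hashed([1, 2, 1, 3, 2, 1, 3, 1, 2, 3, 1], [1, 2, 3]))
```

Every iteration does a constant amount of work, so the scan is O(N) after the O(P) precomputation of the powers.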

Resources