Generating a sequence of n random numbers without duplicates with a space complexity of O(log(n)) - algorithm

I would like to generate a sequence of n random integers in the interval [1,n] without duplicates, i.e. a permutation of the sequence [1,2,...,n] with O(log(n)) space complexity (or a polynomial function of log(n)).
One hint is that I can assume that I have a family of l-wise uniform hash functions h : [n] -> [k] (with l<=n) such that for any y_1, y_2,..., y_l and any distinct x_1, x_2,..., x_l :
P(h(x_1) = y_1 and h(x_2) = y_2 and ... and h(x_l) = y_l) = 1/(k^l)
My first idea was to use the hash function to generate the i-th element of the sequence, i.e. x_i = h(i) , check if x_i is already used (has already been returned by the hash function for some 0<j<i) and if it's the case increment x_i by 1 and check again until x_i is a new number. My problem is I can not have a vector of booleans of size n to check if the value x_i is already used. And if I do a recursive function to get the j-th value I will need at some point O(n log2(n)) bits...
I also found here that a pseudorandom generator such as a linear congruential generator can be used for this kind of problem, with something like x_(i+1) = (a*x_i + c)%n + 1, but I am not sure how to choose a for an arbitrary n so that the period has length n. In that case the hint is not really useful except for generating the first number of the sequence, so I don't think it's the right way.

Here's a fun, super simple solution with constant space, for when N is a power of 2 and your definition of "random" is incredibly loose (the resulting sequence will alternate between even and odd numbers).
N = power of 2
P = prime number larger than N
S = random starting number between 0 and N-1
For i = 1 TO N
    // add our prime to the starting random number
    S += P
    // S modulus N
    // bitwise AND with N-1 works because N is a power of 2
    T = S & (N - 1)
    // T is in [0, N-1] => we want [1, N]
    PRINT (T + 1)
Next I
JS:
for (let N = 64, P = 73, S = Math.floor(N * Math.random()), i = 1; i <= N; i++) {
    S += P;
    console.log((S & (N - 1)) + 1);
}
Another answer would probably be to consider all of the numbers [1, N] as leaf nodes in a tree, where your Log(N) space is the size of the path through the tree. Your solution would be a function that permutes all N paths through the tree. The way you permute the paths in a pseudo-random way would basically be a Linear Feedback Shift Register type generator that has a period greater than N.
https://www.maximintegrated.com/en/design/technical-documents/app-notes/4/4400.html
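A minimal Python sketch of the LFSR idea, under stated assumptions: it uses a Galois LFSR whose tap mask gives a maximal period (0xB400 is the commonly cited mask for 16 bits, so the default only covers N < 65536), and it simply skips states larger than N. The names lfsr_permutation, width, taps and seed are hypothetical. Each of 1..N comes out exactly once with O(width) bits of state, but the order is only loosely "random", not a uniformly chosen permutation.

```python
import random

def lfsr_permutation(n, width=16, taps=0xB400, seed=None):
    """Yield every integer in [1, n] exactly once, in LFSR order.

    Assumption: `taps` is a maximal-length Galois mask for `width` bits,
    so the register visits all 2**width - 1 nonzero states in one cycle.
    """
    if not 1 <= n < (1 << width):
        raise ValueError("n must be in [1, 2**width - 1]")
    start = random.Random(seed).randrange(1, 1 << width)  # any nonzero state
    state = start
    while True:
        if state <= n:          # skip states outside the target range
            yield state
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps       # Galois feedback step
        if state == start:      # full period completed
            break

print(list(lfsr_permutation(10, width=4, taps=0xC, seed=2)))
```

The skipping step costs time, not space: with width chosen as the smallest register that can hold N, fewer than half of the states are discarded.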

Related

Generate one permutation from an index

Is there an efficient algorithm to generate a permutation from one index provided? The permutations do not need to have any specific ordering and it just needs to return every permutation once per every possible index. The set I wish to permute is all integers from 0~255.
If I understand the question correctly, the problem is as follows: You are given two integers n and k, and you want to find the kth permutation of n integers. You don't care about it being the kth lexicographical permutation, but it's just easier to be lexicographical so let's stick with that.
This is not too bad to compute. The base permutation is 1,2,3,4...n. This is the k=0 case. Consider what happens if you were to swap the 1 and 2: by moving the 1, you are passing up every single permutation where 1 goes first, and there are (n-1)! of those (since you could have permuted 2,3,4..n if you fixed the 1 in place). Thus, the algorithm is as follows:
for i from 1 to n:
    j = k / (n-i)!   // integer division, so rounded down
    k -= j * (n-i)!
    place down the jth unplaced number
This will iteratively produce the kth lexicographical permutation, since it repeatedly solves a sub-problem with a smaller set of numbers to place, and decrementing k along the way.
There is an implementation in python in module more-itertools: nth_permutation.
Here is an implementation, adapted from the code of more_itertools.nth_permutation:
from math import factorial

def nth_permutation(iterable, index):
    pool = list(iterable)
    n = len(pool)
    c = factorial(n)
    index = index % c          # wrap around after n! permutations
    result = [0] * n
    q = index
    # Decompose the index in the factorial number system.
    for d in range(1, n + 1):
        q, i = divmod(q, d)
        if 0 <= n - d < n:
            result[n - d] = i
        if q == 0:
            break
    # Each digit selects (and removes) one of the remaining items.
    return tuple(map(pool.pop, result))

print(nth_permutation(range(6), 360))
# (3, 0, 1, 2, 4, 5)

Generating random sublist from ordered list that maintains ordering

Consider a problem where a random sublist of k items, Y, must be selected from X, a list of n items, where the items in Y must appear in the same order as they do in X. The selected items in Y need not be distinct. One solution is this:
for i = 1 to k
    A[i] = floor(rand * n) + 1
    Y[i] = X[A[i]]
sort Y according to the ordering of A
However, this has running time O(k log k) due to the sort operation. To remove this it's tempting to try:
high_index = n
for i = 1 to k
    index = floor(rand * high_index) + 1
    Y[k - i + 1] = X[index]
    high_index = index
But this gives a clear bias to the returned list due to the uniform index selection. It feels like an O(k) solution is attainable if the indices in the second solution were distributed non-uniformly. Does anyone know if this is the case, and if so, what properties the distribution of the marginal indices must have?
An unbiased O(n+k) solution is straightforward. High-level pseudocode:
create an empty histogram of size n [initialized with all elements as zeros]
populate it with k uniformly distributed variables at range. (do k times histogram[inclusiveRand(1,n)]++)
iterate the initial list [A], while decreasing elements in the histogram and appending elements to the result list.
Explanation [edit]:
The idea is to choose k elements out of n at random, with uniform distribution for each, and create a histogram out of it.
This histogram now contains for each index i, how many times A[i] will appear in the resulting Y list.
Now, iterate the list A in-order, and for each element i, insert A[i] into the resulting Y list histogram[i] times.
This guarantees you maintain the order because you insert elements in order, and "never go back".
It also guarantees unbiased solution since for each i,j,K: P(histogram[i]=K) = P(histogram[j]=K), so for each K, each element has the same probability to appear in the resulting list K times.
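A short Python sketch of the histogram approach described above; the function name ordered_sample_histogram and the rng parameter are assumptions made for illustration. It runs in O(n + k) time and needs an auxiliary array of n counters.

```python
import random

def ordered_sample_histogram(xs, k, rng=random):
    """Pick k items from xs (with repetition), keeping the order of xs."""
    n = len(xs)
    counts = [0] * n                      # histogram: how often each index was drawn
    for _ in range(k):
        counts[rng.randrange(n)] += 1     # uniform draw over all n positions
    result = []
    for item, c in zip(xs, counts):       # walk xs in order, never going back
        result.extend([item] * c)
    return result

print(ordered_sample_histogram(list("abcdefghij"), 5))
```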
I believe it can be done in O(k) using the order statistics [X(i)] but I cannot figure it out though :\
By your first algorithm, it suffices to generate k uniform random samples of [0, 1) in sorted order.
Let X1, ..., Xk be these samples. Given that Xk = x, the conditional distribution of X1, ..., Xk-1 is k - 1 uniform random samples of [0, x) in sorted order, so it suffices to sample Xk and recurse.
What's the probability that Xk < x? Each of k independent samples of [0, 1) must be less than x, so the answer (the cumulative distribution function for Xk) is x^k. To sample according to the cdf, all we have to do is invert it on a uniform random sample of [0, 1): pow(random(), 1.0 / k).
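The recursion above can be sketched in Python as follows. This is an illustrative sketch, not a tuned implementation: sorted_uniforms and ordered_sample are hypothetical names, and the min(...) clamp is an added guard in case floating-point rounding pushes a sample up to exactly 1.0.

```python
import random

def sorted_uniforms(k, rng=random):
    """Return k uniform samples of [0, 1) in ascending order, in O(k) time."""
    samples = [0.0] * k
    x = 1.0
    for i in range(k, 0, -1):
        # The maximum of i uniform samples of [0, x) has cdf (t/x)^i,
        # so it can be drawn as x * U^(1/i).
        x *= rng.random() ** (1.0 / i)
        samples[i - 1] = x
    return samples

def ordered_sample(xs, k, rng=random):
    """Map the sorted uniforms to indices of xs, preserving order."""
    n = len(xs)
    return [xs[min(n - 1, int(u * n))] for u in sorted_uniforms(k, rng)]

print(ordered_sample(list(range(1, 11)), 5))
```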
Here's an (expected) O(k) algorithm I actually would consider implementing. The idea is to dump the samples into k bins, sort each bin, and concatenate. Here's some untested Python:
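from random import randrange

def samples(n, k):
    # One bin per expected sample, so each bin has expected size O(1).
    bins = [[] for i in range(k)]
    for i in range(k):
        x = randrange(n)
        bins[(x * k) // n].append(x)
    result = []
    for bin in bins:
        bin.sort()              # bins are tiny on average
        result.extend(bin)
    return result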
Why is this efficient in expectation? Let's suppose we use insertion sort on each bin (each bin has expected size O(1)!). On top of operations that are O(k), we're going to pay proportionally to the sum of the squares of the bin sizes, which is basically the number of collisions. Since the probability of two samples colliding is at most something like 4/k and we have O(k^2) pairs of samples, the expected number of collisions is O(k).
I suspect rather strongly that the O(k) guarantee can be made with high probability.
You can use counting sort to sort Y and thus make the sorting linear with respect to k. However for that you need one additional array of length n. If we assume you have already allocated that, you may execute the code you are asking for arbitrary many times with complexity O(k).
The idea is just as you describe, but I will use one more array cnt of size n that I assume is initialized to 0, and another "stack" st that I assume is empty.
for i = 1 to k
    A[i] = floor(rand * n) + 1
    cnt[A[i]] += 1
    if cnt[A[i]] == 1 // Needed to be able to traverse the inserted elements faster
        st.push(A[i])
for elem in st
    for i = 1 to cnt[elem]
        Y.add(X[elem])
for elem in st
    cnt[elem] = 0
EDIT: as mentioned by oldboy, what I state in the post is not true: I still have to sort st, which might be a bit better than the original proposition but not by much. So this approach is only good if k is comparable to n, in which case we just iterate through cnt linearly and construct Y that way. Then st is not needed:
for i = 1 to k
    A[i] = floor(rand * n) + 1
    cnt[A[i]] += 1
for i = 1 to n
    for j = 1 to cnt[i]
        Y.add(X[i])
    cnt[i] = 0
For the first index in Y, the distribution of indices in X is given by:
P(x; n, k) = binomial(n - x + k - 2, k - 1) / norm
where binomial denotes calculation of the binomial coefficient, and norm is a normalisation factor, equal to the total number of possible sublist configurations.
norm = binomial(n + k - 1, k)
So for k = 5 and n = 10 we have:
norm = 2002
P(x = 0) = 0.357, P(x <= 0) = 0.357
P(x = 1) = 0.247, P(x <= 1) = 0.604
P(x = 2) = 0.165, P(x <= 2) = 0.769
P(x = 3) = 0.105, P(x <= 3) = 0.874
P(x = 4) = 0.063, P(x <= 4) = 0.937
... (we can continue this up to x = 10)
We can sample the X index of the first item in Y from this distribution (call it x1). The distribution of the second index in Y can then be sampled in the same way with P(x; (n - x1), (k - 1)), and so on for all subsequent indices.
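As a quick check of the formula, here is a small Python sketch that tabulates the distribution of the first index; the function name first_index_distribution is an assumption, and x is taken to run over 0..n-1 as in the table above.

```python
from math import comb

def first_index_distribution(n, k):
    """P(x; n, k) = binomial(n - x + k - 2, k - 1) / binomial(n + k - 1, k)."""
    norm = comb(n + k - 1, k)   # total number of possible sublist configurations
    return [comb(n - x + k - 2, k - 1) / norm for x in range(n)]

probs = first_index_distribution(10, 5)
cumulative = 0.0
for x, p in enumerate(probs):
    cumulative += p
    print(f"P(x = {x}) = {p:.3f}, P(x <= {x}) = {cumulative:.3f}")
```

Sampling the first index then amounts to inverting this cumulative table, which is where the non-constant cost per index comes from.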
My feeling now is that the problem is not solvable in O(k), because in general we are unable to sample from the distribution described in constant time. If k = 2 then we can solve in constant time using the quadratic formula (because the probability function simplifies to 0.5(x^2 + x)) but I can't see a way to extend this to all k (my maths isn't great though).
The original list X has n items. There are 2**n possible sublists, since every item will or will not appear in the resulting sublist: each item adds a bit to the enumeration of the possible sublists. You could view this enumeration as a bitword of n bits.
Since you only want sublists with k items, you are interested in bitwords with exactly k bits set.
A practical algorithm could pick (or pick not) the first element from X, and then recurse into the rightmost n-1 substring of X, taking into account the accumulated number of chosen items. Since the X list is processed in order, the Y list will also be in order.
#include <stdio.h>
#include <string.h>

unsigned pick_k_from_n(char target[], char src[], unsigned k, unsigned n, unsigned done);

unsigned pick_k_from_n(char target[], char src[]
        , unsigned k, unsigned n, unsigned done)
{
    unsigned count=0;

    if (k>n) return 0;
    if (k==0) {
        target[done] = 0;
        puts(target);
        return 1;
    }
    if (n > 0) {
        count += pick_k_from_n(target, src+1, k, n-1, done);
        target[done] = *src;
        count += pick_k_from_n(target, src+1, k-1, n-1, done+1);
    }
    return count;
}

int main(int argc, char **argv) {
    char result[20];
    char *domain = "OmgWtf!";
    unsigned cnt ,len, want;

    want = 3;
    switch (argc) {
    default:
    case 3:
        domain = argv[2];
    case 2:
        sscanf(argv[1], "%u", &want);
    case 1:
        break;
    }
    len = strlen(domain);
    cnt = pick_k_from_n(result, domain, want, len, 0);
    fprintf(stderr, "Count=%u\n", cnt);
    return 0;
}
Removing the recursion is left as an exercise to the reader.
Some output:
plasser@pisbak:~/hiero/src$ ./a.out 3 ABBA
BBA
ABA
ABA
ABB
Count=4
plasser@pisbak:~/hiero/src$

The expected number of inversions--From Introduction to Algorithms by Cormen

Let A[1 .. n] be an array of n distinct numbers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A. (See Problem 2-4 for more on inversions.) Suppose that each element of A is chosen randomly, independently, and uniformly from the range 1 through n. Use indicator random variables to compute the expected number of inversions.
The problem is from exercise 5.2-5 in Introduction to Algorithms by Cormen. Here is my recursive solution:
Suppose x(i) is the number of inversions in a[1..i], and E(i) is the expected value of x(i), then E(i+1) can be computed as following:
Imagine we have i+1 positions to place all the numbers. If we place i+1 in the first position, then x(i+1) = i + x(i); if we place i+1 in the second position, then x(i+1) = i-1 + x(i), and so on. So E(i+1) = 1/(i+1) * sum(k) + E(i), where k ranges over [0,i]. Finally we get E(i+1) = i/2 + E(i).
Because we know that E(2) = 0.5, recursively we get: E(n) = (n-1 + n-2 + ... + 2)/2 + 0.5 = n*(n-1)/4.
Although the deduction above seems to be right, I am still not very sure of it, so I am sharing it here.
If there is something wrong, please correct me.
All the solutions seem to be correct, but the problem says that we should use indicator random variables. So here is my solution using the same:
Let Eij be the event that i < j and A[i] > A[j].
Let Xij = I{Eij} = { 1 if (i, j) is an inversion of A
                     0 if (i, j) is not an inversion of A }
Let X = Σ(i=1 to n)Σ(j=1 to n)(Xij) = number of inversions of A.
E[X] = E[Σ(i=1 to n)Σ(j=1 to n)(Xij)]
     = Σ(i=1 to n)Σ(j=1 to n)(E[Xij])
     = Σ(i=1 to n)Σ(j=1 to n)(P(Eij))
     = Σ(i=1 to n)Σ(j=i + 1 to n)(P(Eij))   (as we must have i < j)
     = Σ(i=1 to n)Σ(j=i + 1 to n)(1/2)      (we can choose the two numbers in C(n, 2) ways and arrange them as required, so P(Eij) = C(n, 2) / (n(n-1)) = 1/2)
     = Σ(i=1 to n)((n - i)/2)
     = n(n - 1)/4
Another solution is even simpler, IMO, although it does not use "indicator random variables".
Since all of the numbers are distinct, every pair of elements is either an inversion (i < j with A[i] > A[j]) or a non-inversion (i < j with A[i] < A[j]). Put another way, every pair of numbers is either in order or out of order.
So for any given permutation, the total number of inversions plus non-inversions is just the total number of pairs, or n*(n-1)/2.
By symmetry of "less than" and "greater than", the expected number of inversions equals the expected number of non-inversions.
Since the expectation of their sum is n*(n-1)/2 (constant for all permutations), and they are equal, they are each half of that or n*(n-1)/4.
[Update 1]
Apparently my "symmetry of 'less than' and 'greater than'" statement requires some elaboration.
For any array of numbers A in the range 1 through n, define ~A as the array you get when you subtract each number from n+1. For example, if A is [2,3,1], then ~A is [2,1,3].
Now, observe that for any pair of numbers in A that are in order, the corresponding elements of ~A are out of order. (Easy to show because negating two numbers exchanges their ordering.) This mapping explicitly shows the symmetry (duality) between less-than and greater-than in this context.
So, for any A, the number of inversions equals the number of non-inversions in ~A. But for every possible A, there corresponds exactly one ~A; when the numbers are chosen uniformly, both A and ~A are equally likely. Therefore the expected number of inversions in A equals the expected number of inversions in ~A, because these expectations are being calculated over the exact same space.
Therefore the expected number of inversions in A equals the expected number of non-inversions. The sum of these expectations is the expectation of the sum, which is the constant n*(n-1)/2, or the total number of pairs.
[Update 2]
A simpler symmetry: For any array A of n elements, define ~A as the same elements but in reverse order. Associate the element at position i in A with the element at position n+1-i in ~A. (That is, associate each element with itself in the reversed array.)
Now any inversion in A is associated with a non-inversion in ~A, just as with the construction in Update 1 above. So the same argument applies: The number of inversions in A equals the number of non-inversions in ~A; both A and ~A are equally likely sequences; etc.
The point of the intuition here is that the "less than" and "greater than" operators are just mirror images of each other, which you can see either by negating the arguments (as in Update 1) or by swapping them (as in Update 2). So the expected number of inversions and non-inversions is the same, since you cannot tell whether you are looking at any particular array through a mirror or not.
Even simpler (similar to Aman's answer above, but perhaps clearer) ...
Let Xij be a random variable with Xij=1 if A[i] > A[j] and Xij=0 otherwise.
Let X=sum(Xij) over i, j where i < j
Number of pairs (ij)*: n(n-1)/2
Probability that Xij=1 (Pr(Xij=1)): 1/2
By linearity of expectation**:
E(X) = E(sum(Xij))
     = sum(E(Xij))
     = sum(Pr(Xij=1))
     = n(n-1)/2 * 1/2
     = n(n-1)/4
* I think of this as the size of the upper triangle of a square matrix.
** All sums here are over i, j, where i < j.
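For small n the result is easy to confirm exhaustively. The Python sketch below assumes, as these answers do, that A is a uniformly random permutation of 1..n (distinct values); the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import permutations

def count_inversions(a):
    """Number of pairs (i, j) with i < j and a[i] > a[j]."""
    return sum(1 for i in range(len(a))
                 for j in range(i + 1, len(a)) if a[i] > a[j])

def expected_inversions(n):
    """Exact average inversion count over all n! permutations of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(count_inversions(p) for p in perms), len(perms))

for n in range(1, 7):
    print(n, expected_inversions(n), Fraction(n * (n - 1), 4))
```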
I think it's right, but I think the proper way to prove it is to use conditional expectations:
for all X and Y we have: E[X] = E[E[X|Y]]
then in your case:
E(i+1) = E[x(i+1)] = E[E[x(i+1) | x(i)]] = E[SUM(k)/(1+i) + x(i)] = i/2 + E[x(i)] = i/2 + E(i)
About the second statement:
if
E(n) = n*(n-1)/4
then E(n+1) = (n+1)*n/4 = (n-1)*n/4 + 2*n/4 = (n-1)*n/4 + n/2 = E(n) + n/2
So n*(n-1)/4 satisfies the recurrence relation for all n >= 2, and it gives the right value for n = 2.
So E(n) = n*(n-1)/4
Hope I understood your problem and it helps
Using indicator random variables:
Let X = random variable which is equal to the number of inversions.
Let Xij = 1 if A[i] and A[j] form an inversion pair, and Xij = 0 otherwise.
Number of inversion pairs = Sum over 1 <= i < j <= n of (Xij)
Now P[Xij = 1] = P[A[i] > A[j]] = (n choose 2) / (2! * n choose 2) = 1/2
E[X] = E[sum over all ij pairs such that i < j of Xij] = sum over all ij pairs such that i < j of E[Xij] = n(n - 1) / 4

Algorithm to find smallest N such that N! is divisible by a prime raised to a power

Is there an efficient algorithm to compute the smallest integer N such that N! is divisible by p^k where p is a relatively small prime number and k, a very large integer. In other words,
factorial(N) mod p^k == 0
If, given N and p, I wanted to find how many times p divides into N!, I would use the well-known formula
k = Sum(floor(N/p^i)) for i = 1, 2, ...
I've done brute force searches for small values of k but that approach breaks down very quickly as k increases and there doesn't appear to be a pattern that I can extrapolate to larger values.
Edited 6/13/2011
Using suggestions proposed by Fiver and Hammar, I used a quasi-binary search to solve the problem but not quite in the manner they suggested. Using a truncated version of the second formula above, I computed an upper bound on N as the product of k and p (using just the first term). I used 1 as the lower bound. Using the classic binary search algorithm, I computed the midpoint between these two values and calculated what k would be using this midpoint value as N in the second formula, this time with all the terms being used.
If the computed k was too small, I adjusted the lower bound and repeated. Too big, I first tested to see if k computed at midpoint-1 was smaller than the desired k. If so, midpoint was returned as the closest N. Otherwise, I adjusted the highpoint and repeated.
If the computed k were equal, I tested whether the value at midpoint-1 was equal to the value at midpoint. If so, I adjusted the highpoint to be the midpoint and repeated. If midpoint-1 was less than the desired k, the midpoint was returned as the desired answer.
Even with very large values for k (10 or more digits), this approach works at O(n log(n)) speeds.
OK this is kind of fun.
Define f(i) = (p^i - 1) / (p - 1)
Write k in a kind of funny "base" where the value of position i is this f(i).
You do this from most-significant to least-significant digit. So first, find the largest j such that f(j) <= k. Then compute the quotient and remainder of k / f(j). Store the quotient as q_j and the remainder as r. Now compute the quotient and remainder of r / f(j-1). Store the quotient as q_{j-1} and the remainder as r again. Now compute the quotient and remainder of r / f(j-2). And so on.
This generates a sequence q_j, q_{j-1}, q_{j-2}, ..., q_1. (Note that the sequence ends at 1, not 0.) Then compute q_j*p^j + q_{j-1}*p^(j-1) + ... q_1*p. That's your N.
Example: k = 9, p = 3. So f(i) = (3^i - 1) / 2. f(1) = 1, f(2) = 4, f(3) = 13. So the largest j with f(j) <= 9 is j = 2 with f(2) = 4. Take the quotient and remainder of 9 / 4. That's a quotient of 2 (which is the digit in our 2's place) and remainder of 1.
For that remainder of 1, find the quotient and remainder of 1 / f(1). Quotient is 1, remainder is zero, so we are done.
So q_2 = 2, q_1 = 1. 2*3^2 + 1*3^1 = 21, which is the right N.
I have an explanation on paper for why this works, but I am not sure how to communicate it in text... Note that f(i) answers the question, "how many factors of p are there in (p^i)!". Once you find the largest i,j such that j*f(i) is less than k, and realize what you are really doing is finding the largest j*p^i less than N, the rest kind of falls out of the wash. In our p=3 example, for instance, we get 4 p's contributed by the product of 1-9, 4 more contributed by the product of 10-18, and one more contributed by 21. Those first two are just multiples of p^2; f(2) = 4 is telling us that each multiple of p^2 contributes 4 more p's to the product.
[update]
Code always helps to clarify. Save the following perl script as foo.pl and run it as foo.pl <p> <k>. Note that ** is Perl's exponentiation operator, bdiv computes a quotient and remainder for BigInts (unlimited-precision integers), and use bigint tells Perl to use BigInts everywhere.
#!/usr/bin/env perl
use warnings;
use strict;
use bigint;

@ARGV == 2
    or die "Usage: $0 <p> <k>\n";
my ($p, $k) = map { Math::BigInt->new($_) } @ARGV;

sub f {
    my $i = shift;
    return ($p ** $i - 1) / ($p - 1);
}

my $j = 0;
while (f($j) <= $k) {
    $j++;
}
$j--;

my $N = 0;
my $r = $k;
while ($r > 0) {
    my $val = f($j);
    my ($q, $new_r) = $r->bdiv($val);
    $N += $q * ($p ** $j);
    $r = $new_r;
    $j--;
}
print "Result: $N\n";
exit 0;
Using the formula you mentioned, the sequence of k values given fixed p and N = 1,2... is non-decreasing. This means you can use a variant of binary search to find N given the desired k.
Start with N = 1, and calculate k.
Double N until k is greater than or equal to your desired k to get an upper bound.
Do a binary search on the remaining interval to find your k.
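The three steps above can be sketched in Python as follows. The names legendre and smallest_n are assumptions for illustration; legendre evaluates the formula from the question (the number of times p divides N!), and k is assumed to be at least 1.

```python
def legendre(n, p):
    """Exponent of the prime p in n!: sum of floor(n / p^i) for i >= 1."""
    count = 0
    while n:
        n //= p
        count += n
    return count

def smallest_n(p, k):
    """Smallest N such that p^k divides N! (assumes p prime, k >= 1)."""
    hi = 1
    while legendre(hi, p) < k:    # doubling phase: find an upper bound
        hi *= 2
    lo = hi // 2                  # invariant: legendre(lo) < k <= legendre(hi)
    while hi - lo > 1:            # binary search on a non-decreasing function
        mid = (lo + hi) // 2
        if legendre(mid, p) >= k:
            hi = mid
        else:
            lo = mid
    return hi

print(smallest_n(3, 9))   # the worked example with p = 3, k = 9
```

Python integers are arbitrary precision, so very large k needs no special handling here.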
Why don't you try binary search for the answer, using the second formula you mentioned?
You only need to consider values for N, for which p divides N, because if it doesn't, then N! and (N-1)! are divided by the same power of p, so N can't be the smallest one.
Consider
I = (pn)!
and ignore prime factors other than p. The result looks like
I = p^n * p^(n-1) * p^(n-2) * ... * p * 1
I = p^(n + (n-1) + (n-2) + ... + 2 + 1)
I = p^((n^2 + n)/2)
So we're trying to find the smallest n such that
(n^2 + n)/2 >= k
which if I remember the quadratic equation right gives us
N = pn, where n >= (sqrt(1+8k) - 1)/2
(P.S. Does anyone know how to show the radical symbol in markdown?)
EDIT:
This is wrong. Let me see if I can salvage it...

Calculating sum of geometric series (mod m)

I have a series
S = i^(m) + i^(2m) + ... + i^(km) (mod m)
0 <= i < m, k may be very large (up to 100,000,000), m <= 300000
I want to find the sum. I cannot apply the Geometric Progression (GP) formula because then result will have denominator and then I will have to find modular inverse which may not exist (if the denominator and m are not coprime).
So I made an alternate algorithm making an assumption that these powers will make a cycle of length much smaller than k (because it is a modular equation and so I would obtain something like 2,7,9,1,2,7,9,1....) and that cycle will repeat in the above series. So instead of iterating from 0 to k, I would just find the sum of numbers in a cycle and then calculate the number of cycles in the above series and multiply them. So I first found i^m (mod m) and then multiplied this number again and again taking modulo at each step until I reached the first element again.
But when I actually coded the algorithm, for some values of i, I got cycles which were of very large size. And hence took a large amount of time before terminating and hence my assumption is incorrect.
So is there any other pattern we can find out? (Basically I don't want to iterate over k.)
So please give me an idea of an efficient algorithm to find the sum.
This is the algorithm for a similar problem I encountered
You probably know that one can calculate the power of a number in logarithmic time. You can also do so for calculating the sum of the geometric series. Since it holds that
1 + a + a^2 + ... + a^(2*n+1) = (1 + a) * (1 + (a^2) + (a^2)^2 + ... + (a^2)^n),
you can recursively calculate the geometric series on the right hand to get the result.
This way you do not need division, so you can take the remainder of the sum (and of intermediate results) modulo any number you want.
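A minimal recursive sketch of this identity in Python. The function names are hypothetical; the odd-length case peels off the leading 1, which the text above does not spell out but follows directly from 1 + a*(1 + a + ... + a^(t-2)).

```python
def geometric_sum(a, terms, m):
    """Return (1 + a + a^2 + ... + a^(terms-1)) mod m without any division."""
    if terms == 0:
        return 0
    if terms % 2 == 1:
        # Odd number of terms: 1 + a * (sum of the first terms-1 powers).
        return (1 + a * geometric_sum(a, terms - 1, m)) % m
    # Even number of terms: (1 + a) * (1 + a^2 + (a^2)^2 + ...), half as many terms.
    return ((1 + a) * geometric_sum(a * a % m, terms // 2, m)) % m

def series_sum(i, k, m):
    """S = i^m + i^(2m) + ... + i^(km) (mod m), as in the question."""
    r = pow(i, m, m)                      # common ratio i^m mod m
    return r * geometric_sum(r, k, m) % m

print(series_sum(2, 100000000, 299999))
```

The recursion depth is O(log k), since an odd step is always followed by a halving step.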
As you've noted, doing the calculation for an arbitrary modulus m is difficult because many values might not have a multiplicative inverse mod m. However, if you can solve it for a carefully selected set of alternate moduli, you can combine them to obtain a solution mod m.
Factor m into p_1, p_2, p_3 ... p_n such that each p_i is a power of a distinct prime
Since each p_i is a power of a distinct prime, they are pairwise coprime. If we can calculate the sum of the series with respect to each modulus p_i, we can use the Chinese Remainder Theorem to reassemble them into a solution mod m.
For each prime power modulus, there are two trivial special cases:
If i^m is congruent to 0 mod p_i, the sum is trivially 0.
If i^m is congruent to 1 mod p_i, then the sum is congruent to k mod p_i.
For other values, one can apply the usual formula for the sum of a geometric sequence:
S = sum(j=0 to k, (i^m)^j) = ((i^m)^(k+1) - 1) / (i^m - 1)
TODO: Prove that (i^m - 1) is coprime to p_i or find an alternate solution for when they have a nontrivial GCD. Hopefully the fact that p_i is a prime power and also a divisor of m will be of some use... If p_i is a divisor of i, the condition holds. If p_i is prime (as opposed to a prime power), then either the special case i^m = 1 applies, or (i^m - 1) has a multiplicative inverse.
If the geometric sum formula isn't usable for some p_i, you could rearrange the calculation so you only need to iterate from 1 to p_i instead of 1 to k, taking advantage of the fact that the terms repeat with a period no longer than p_i.
(Since your series doesn't contain a j=0 term, the value you want is actually S-1.)
This yields a set of congruences mod p_i, which satisfy the requirements of the CRT.
The procedure for combining them into a solution mod m is described in the above link, so I won't repeat it here.
This can be done via the method of repeated squaring, which is O(log(k)) time, or O(log(k)log(m)) time, if you consider m a variable.
In general, a[n]=1+b+b^2+... b^(n-1) mod m can be computed by noting that:
a[j+k]==b^{j}a[k]+a[j]
a[2n]==(b^n+1)a[n]
The second just being the corollary for the first.
In your case, b=i^m can be computed in O(log m) time.
The following Python code implements this:
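def geometric(n, b, m):
    # Returns (1 + b + b^2 + ... + b^(n-1)) mod m.
    T = 1          # a[2^j] for the current bit j
    e = b % m      # b^(2^j) mod m
    total = 0      # a[low bits of n processed so far]
    while n > 0:
        if n & 1 == 1:
            total = (e * total + T) % m
        T = ((e + 1) * T) % m
        e = (e * e) % m
        n = n // 2
        # print('{} {} {}'.format(total, T, e))
    return total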
This bit of magic has a mathematical reason - the operation on pairs defined as
(a,r)#(b,s)=(ab,as+r)
is associative, and the rule 1 basically means that:
(b,1)#(b,1)#... n times ... #(b,1)=(b^n,1+b+b^2+...+b^(n-1))
Repeated squaring always works when operations are associative. In this case, the # operator is O(log(m)) time, so repeated squaring takes O(log(n)log(m)).
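To make the associativity argument concrete, here is a small Python sketch that applies generic repeated squaring to the pair operator # defined above. The names combine and geometric_by_pairs are assumptions for illustration; (1, 0) serves as the identity element of #.

```python
def combine(x, y, m):
    """(a, r) # (b, s) = (a*b, a*s + r), with both components reduced mod m."""
    a, r = x
    b, s = y
    return (a * b % m, (a * s + r) % m)

def geometric_by_pairs(b, n, m):
    """Second component of (b, 1) # ... # (b, 1) (n times): 1 + b + ... + b^(n-1) mod m."""
    result = (1 % m, 0)          # identity element for #
    base = (b % m, 1 % m)        # the pair (b, 1)
    while n > 0:
        if n & 1:
            result = combine(result, base, m)
        base = combine(base, base, m)   # square the current pair
        n >>= 1
    return result[1]

print(geometric_by_pairs(2, 10, 1000))
```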
One way to look at this is as a matrix exponentiation:
[[b,1],[0,1]]^n == [[b^n,1+b+...+b^(n-1)],[0,1]]
You can use a similar method to compute (a^n-b^n)/(a-b) modulo m because matrix exponentiation gives:
[[b,1],[0,a]]^n == [[b^n,a^(n-1)+a^(n-2)b+...+ab^(n-2)+b^(n-1)],[0,a^n]]
Based on the approach of @braindoper, a complete algorithm which calculates
1 + a + a^2 + ... +a^n mod m
looks like this in Mathematica:
geometricSeriesMod[a_, n_, m_] :=
  Module[ {q = a, exp = n, factor = 1, sum = 0, temp},
    While[And[exp > 0, q != 0],
      If[EvenQ[exp],
        temp = Mod[factor*PowerMod[q, exp, m], m];
        sum = Mod[sum + temp, m];
        exp--];
      factor = Mod[Mod[1 + q, m]*factor, m];
      q = Mod[q*q, m];
      exp = Floor[ exp /2];
    ];
    Return [Mod[sum + factor, m]]
  ]
Parameters:
a is the "ratio" of the series. It can be any integer (including zero and negative values).
n is the highest exponent of the series. Allowed are integers >= 0.
m is the integer modulus != 0
Note: The algorithm performs a Mod operation after every arithmetic operation. This is essential, if you transcribe this algorithm to a language with a limited word length for integers.

Resources