I have an array of n random integers.
I choose a random integer from it and partition the array around it (all integers smaller than the chosen one go to the left side, all bigger ones go to the right side).
What will be the sizes of my left and right sides in the average case, assuming there are no duplicates in the array?
I can easily see that there is a 1/n chance that the array is split in half, if we are lucky. Additionally, there is a 1/n chance that the array is split so that the left side has length n/2−1 and the right side has length n/2+1, and so on.
Could we derive the "average" case from this observation?
You can probably find a better explanation (and certainly the proper citations) in a textbook on randomized algorithms, but here's the gist of average-case QuickSort, in two different ways.
First way
Let C(n) be the expected number of comparisons required for a random permutation of 1...n. Since the expectation of a sum equals the sum of the expectations, the expected cost of the two recursive calls is the sum of their individual expected costs, so we can write a recurrence that averages over the n possible divisions:
$C(0) = 0$
$C(n) = n - 1 + \frac{1}{n} \sum_{i=0}^{n-1} \bigl(C(i) + C(n-1-i)\bigr)$
Rather than pull the exact solution out of a hat (or peek at the second way), I'll show you how I'd get an asymptotic bound.
First, I'd guess the asymptotic bound. Obviously I'm familiar with QuickSort and my reasoning here is fabricated, but since the best case is O(n log n) by the Master Theorem, that's a reasonable place to start.
Second, I'd guess an actual bound: 100 n log (n + 1). I use a big constant because why not? It doesn't matter for asymptotic notation and can only make my job easier. I use log (n + 1) instead of log n because log n is undefined for n = 0, and 0 log (0 + 1) = 0 covers the base case.
Third, let's try to verify the inductive step. Assuming that C(i) ≤ 100 i log (i + 1) for all i ∈ {0, ..., n−1},
$$
\begin{aligned}
C(n) &= n - 1 + \frac{1}{n} \sum_{i=0}^{n-1} \bigl(C(i) + C(n-1-i)\bigr) && \text{[by definition]} \\
&= n - 1 + \frac{2}{n} \sum_{i=0}^{n-1} C(i) && \text{[by symmetry]} \\
&\le n - 1 + \frac{2}{n} \sum_{i=0}^{n-1} 100\, i \log(i+1) && \text{[by the inductive hypothesis]} \\
&\le n - 1 + \frac{2}{n} \int_0^n 100\, x \log(x+1)\, dx && \text{[the sum is a lower Darboux sum of an increasing function]} \\
&= n - 1 + \frac{2}{n} \bigl(50 (n^2 - 1) \log(n+1) - 25 (n-2)\, n\bigr) && \text{[WolframAlpha FTW, I forgot how to integrate]} \\
&= n - 1 + 100 \left(n - \frac{1}{n}\right) \log(n+1) - 50 (n-2) \\
&= 100 \left(n - \frac{1}{n}\right) \log(n+1) - 49 n + 99.
\end{aligned}
$$
Well that's irritating. It's almost what we want but that + 99 messes up the program a little bit. We can extend the base cases to n = 1 and n = 2 by inspection and then assume that n ≥ 3 to finish the bound:
$$
\begin{aligned}
C(n) &\le 100 \left(n - \frac{1}{n}\right) \log(n+1) - 49 n + 99 \\
&\le 100\, n \log(n+1) - 49 n + 99 \\
&\le 100\, n \log(n+1). && \text{[since } n \ge 3 \text{ implies } 49 n \ge 99\text{]}
\end{aligned}
$$
Once again, no one would publish such a messy derivation. I wanted to show how one could work it out formally without knowing the answer ahead of time.
Second way
How else can we derive how many comparisons QuickSort does in expectation? Another possibility is to exploit the linearity of expectation by summing over each pair of elements the probability that those elements are compared. What is that probability? We observe that a pair {i, j} is compared if and only if, at the leaf-most invocation where i and j exist in the array, either i or j is chosen as the pivot. This happens with probability 2/(j+1 − i), since the pivot must be i, j, or one of the j − (i+1) elements that compare between them. Therefore,
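$$
\begin{aligned}
C(n) &= \sum_{i=1}^{n} \sum_{j=i+1}^{n} \frac{2}{j+1-i} \\
&= \sum_{i=1}^{n} \sum_{d=2}^{n+1-i} \frac{2}{d} \\
&= \sum_{i=1}^{n} 2 \bigl(H(n+1-i) - 1\bigr) && \text{[where H is the harmonic numbers]} \\
&= 2 \sum_{i=1}^{n} H(i) - 2n \\
&= 2 (n+1) \bigl(H(n+1) - 1\bigr) - 2n. && \text{[WolframAlpha FTW again]}
\end{aligned}
$$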
Since H(n) is Θ(log n), this is Θ(n log n), as expected.
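As a sanity check that the two ways describe the same quantity, here is a small Python 3 sketch that evaluates the recurrence from the first way and the sum over pairs from the second way with exact fractions. The function names are made up for this illustration.

```python
from fractions import Fraction


def comparisons_by_recurrence(n_max):
    """Return [C(0), ..., C(n_max)] using the averaging recurrence."""
    c = [Fraction(0)]
    for n in range(1, n_max + 1):
        # Average the cost of the two recursive calls over the n possible pivots.
        total = sum(c[i] + c[n - 1 - i] for i in range(n))
        c.append(n - 1 + total / n)
    return c


def comparisons_by_pairs(n):
    """Sum, over all pairs i < j, the probability 2/(j+1-i) that they are compared."""
    return sum(
        Fraction(2, j + 1 - i)
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
    )


if __name__ == "__main__":
    table = comparisons_by_recurrence(10)
    for n in range(11):
        print(n, table[n], comparisons_by_pairs(n))
```

Both columns should agree for every n, which is a cheap way to catch slips in either derivation.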
Exp(n)
    If n = 0
        Return 1
    End If
    If n%2==0
        temp = Exp(n/2)
        Return temp × temp
    Else //n is odd
        temp = Exp((n−1)/2)
        Return temp × temp × 2
    End if
How can I prove by strong induction on n that, for all n ≥ 1, the number of multiplications made by Exp(n) is ≤ 2 log2 n?
PS: Exp(n) = 2^n
A simple way is to use strong induction.
First, prove that Exp(0) terminates and returns 2^0.
Let N be some arbitrary even nonnegative number.
Assume the function Exp correctly calculates and returns 2^n for every n in [0, N].
Under this assumption, prove that Exp(N+1) and Exp(N+2) both terminate and correctly return 2^(N+1) and 2^(N+2).
You're done! By induction it follows that for any nonnegative N, Exp(N) correctly returns 2^N.
PS: Note that in this post, 2^N means "two to the power of N" and not "bitwise xor of the binary representations of 2 and N".
The program exactly applies the following recurrence:
P[0] = 1
n even -> P[n] = P[n/2]²
n odd -> P[n] = P[(n-1)/2]².2
The program always terminates because, for n > 0, both n/2 and (n-1)/2 are smaller than n, so the argument of the recursive calls always decreases.
P[n] = 2^n is the solution of the recurrence. Indeed,
n = 0 -> 2^0 = 1
n = 2m -> 2^n = (2^m)²
n = 2m+1 -> 2^n = 2.(2^m)²
and this covers all cases.
As every call decreases the number of significant bits of n by one and performs one or two multiplications, the total number does not exceed two times the number of significant bits.
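To make the counting argument concrete, here is a Python 3 sketch that mirrors the Exp pseudocode and returns the number of multiplications alongside the result. The name `exp2` and the pair-returning interface are choices made for this illustration only.

```python
def exp2(n):
    """Return (2**n, multiplications used), following the Exp pseudocode."""
    if n == 0:
        return 1, 0
    # n // 2 equals n/2 for even n and (n-1)/2 for odd n.
    half, count = exp2(n // 2)
    if n % 2 == 0:
        return half * half, count + 1      # one multiplication
    return half * half * 2, count + 2      # two multiplications


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 1000):
        value, count = exp2(n)
        print(n, count, 2 * n.bit_length(), value == 2 ** n)
```

`n.bit_length()` is the number of significant bits of n, so the second and third printed columns compare the actual count with the bound stated above.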
I have got an exercise that asks me to write a program that determines whether N! is divisible by N^2.
1 ≤ N ≤ 10^9
I wanted to do this the easy way, by writing a factorial function and dividing its result by N squared, but obviously that won't work for N this large.
Just an algorithm or pseudo-code would be enough.
For any n > 4, if n is a prime, then n! is not evenly divisible by n^2.
Here is simple explanation to support my argument:
After n! is divided by n, we are left with (n-1)! in the numerator, which still needs to be divided by n. For (n-1)! to be evenly divisible by n, the factors 1, ..., n-1 must together supply every prime factor of n, which can never happen when n is prime, because none of them is a multiple of n.
When n is composite and greater than 4, on the other hand, (n-1)! is always divisible by n. Check it out for yourself by diving into a bit of number theory.
Hope it helps!!!
Edit: Here is a simple Python code for the above. Complexity is O(sqrt(N)):
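def checkPrime(n):
    # Trial division up to and including sqrt(n); comparing i * i with n
    # avoids floating point and does not miss perfect squares such as 9.
    i = 2
    while i * i <= n:
        if n % i == 0:
            return "Yes"  # non-prime, so it's divisible
        i = i + 1
    return "No"  # prime, so not divisible

def main():
    n = int(raw_input())
    if n == 1:
        print "Yes"
    elif n == 4:
        print "No"
    else:
        print checkPrime(n)

main()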
Input:
7
Output:
No
This is related to, though easier than, Wilson's Theorem, which says that a number n > 1 is prime if and only if
(n-1)! = -1 (mod n)
This is algebraically equivalent to saying that n>1 is prime if and only if
n! = -n (mod n^2)
Furthermore, it is known and easy to prove that (to quote the Wikipedia article)
With the sole exception of 4, where 3! = 6 ≡ 2 (mod 4), if n is
composite then (n − 1)! is congruent to 0 (mod n).
Hence, with the sole exception of 4, if n is composite then (n-1)! = 0 (mod n) and therefore n! = 0 (mod n^2). If n is prime, n! = -n = n^2-n (mod n^2), so n! isn't congruent to 0 in that case.
The full power of Wilson's theorem is needed if you want to show that for prime n, n! leaves a remainder of exactly n^2-n upon division by n^2. For this problem all you need to know is that it isn't zero.
In any event, you could just write a program which runs a primality check, although whether or not that would be considered a valid solution is up to whoever assigned the problem.
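For what it's worth, the rule "divisible exactly when N is 1, or composite and not 4" is easy to cross-check against the definition for small N. The sketch below is Python 3; the function names are invented for this example, and the brute-force check is only practical for small inputs.

```python
from math import factorial


def is_prime(n):
    """Trial division; fine for N up to about 10^9."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def factorial_divisible_by_square(n):
    """Decide whether n! is divisible by n^2 using the primality rule."""
    if n == 1:
        return True
    if n == 4:
        return False
    return not is_prime(n)


def brute_force(n):
    """Direct check from the definition; only usable for small n."""
    return factorial(n) % (n * n) == 0


if __name__ == "__main__":
    mismatches = [n for n in range(1, 200)
                  if factorial_divisible_by_square(n) != brute_force(n)]
    print("mismatches:", mismatches)
```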
I have recently stumbled upon an algorithmic problem and I can't get to the bottom of it. You're given a positive integer N < 10^13, and you need to choose a nonnegative integer M such that the sum MN + N(N-1)/2 has the smallest possible number of divisors that lie between 1 and N, inclusive.
Can someone point me in the right direction for solving this problem?
Thank you for your time.
Find a prime P greater than N. There are a number of ways to do this.
If N is odd, then M*N + N*(N-1)/2 is a multiple of N. It must be divisible by any factor of N, but if we choose M = P - (N-1)/2, then M*N + N*(N-1)/2 = P*N, so it isn't divisible by any other integers between 1 and N.
If N is even, then M*N + N*(N-1)/2 is a multiple of N/2. It must be divisible by any factor of N/2, but if we choose M = (P - N + 1)/2 (which must be an integer), then M*N + N*(N-1)/2 = (P - N + 1)*N/2 + (N-1)*N/2 = P*N/2, so it isn't divisible by any other integers between 1 and N.
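The construction above could be sketched in Python 3 as follows. The helper names are hypothetical, and plain trial division is used as the primality test purely for simplicity; for N close to 10^13 a faster test (for example Miller-Rabin) would probably be preferable.

```python
def is_prime(k):
    """Trial division primality test (simple, not the fastest option)."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def next_prime_above(n):
    """Smallest prime P with P > n."""
    p = n + 1
    while not is_prime(p):
        p += 1
    return p


def choose_m(n):
    """Return M such that M*n + n*(n-1)/2 equals P*n (n odd) or P*n/2 (n even)."""
    p = next_prime_above(n)
    if n % 2 == 1:
        return p - (n - 1) // 2
    # For even n the prime p > n is odd, so p - n + 1 is even.
    return (p - n + 1) // 2


if __name__ == "__main__":
    for n in (5, 6, 10, 15):
        m = choose_m(n)
        print(n, m, m * n + n * (n - 1) // 2)
```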
The F series is defined as
F(0) = 1
F(1) = 1
F(i) = i * F(i - 1) * F(i - 2) for i > 1
The task is to find the number of different divisors for F(i)
This question is from Timus. I tried the following Python, but it surely exceeds the time limit. This brute-force approach will not work for large inputs, since it will cause integer overflow as well.
#!/usr/bin/env python
from math import sqrt
n = int(raw_input())
def f(n):
    global arr
    if n == 0:
        return 1
    if n == 1:
        return 1
    a = 1
    b = 1
    for i in xrange(2, n + 1):
        k = i * a * b
        a = b
        b = k
    return b
x = f(n)
cnt = 0
for i in xrange(1, int(sqrt(x)) + 1):
    if x % i == 0:
        if x / i == i:
            cnt += 1
        else:
            cnt += 2
print cnt
Any optimization?
EDIT
I have tried the suggestion and rewritten the solution (not storing the F(n) value directly, but a list of factors):
#!/usr/bin/env python
#from math import sqrt
T = 10000
primes = range(T)
primes[0] = False
primes[1] = False
primes[2] = True
primes[3] = True
for i in xrange(T):
    if primes[i]:
        j = i + i
        while j < T:
            primes[j] = False
            j += i
p = []
for i in xrange(T):
    if primes[i]:
        p.append(i)
n = int(raw_input())
def f(n):
    global p
    if n == 1:
        return 1
    a = dict()
    b = dict()
    for i in xrange(2, n + 1):
        c = a.copy()
        for y in b.iterkeys():
            if c.has_key(y):
                c[y] += b[y]
            else:
                c[y] = b[y]
        k = i
        for y in p:
            d = 0
            if k % y == 0:
                while k % y == 0:
                    k /= y
                    d += 1
                if c.has_key(y):
                    c[y] += d
                else:
                    c[y] = d
            if k < y: break
        a = b
        b = c
    k = 1
    for i in b.iterkeys():
        k = k * (b[i] + 1) % (1000000007)
    return k
print f(n)
And it still gives TL5: it is not fast enough, but it does solve the problem of overflow for the value F(n).
First see this wikipedia article on the divisor function. In short, if you have a number and you know its prime factors, you can easily calculate the number of divisors (get SO to do TeX math):
$n = \prod_{i=1}^r p_i^{a_i}$
$\sigma_x(n) = \prod_{i=1}^{r} \frac{p_{i}^{(a_{i}+1)x}-1}{p_{i}^x-1}$
Anyway, it's a simple function.
Now, to solve your problem, instead of keeping F(n) as the number itself, keep it as a set of prime factors and exponent sizes. Then the function that calculates F(n) simply takes the two sets for F(n-1) and F(n-2), sums the exponents of the same prime factors in both sets (assuming zero for nonexistent ones) and additionally adds the set of prime factors and exponent sizes for the number i. This means that you need another simple [1] function to find the prime factors of i.
Computing F(n) this way, you just need to apply the above formula (taken from Wikipedia) to the set and there's your value. Note also that F(n) can quickly get very large. This solution also avoids usage of big-num libraries (since no prime factor nor its exponent is likely to go beyond 4 billion [2]).
[1] Of course this is not so simple for arbitrarily large i, otherwise we wouldn't have any form of security right now, but for your application it should be simple enough.
[2] Well it might. If you happen to figure out a simple formula answering your question given any n, then large ns would also be possible in the test case, for which this algorithm is likely going to give a time limit exceeded.
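A minimal Python 3 sketch of this idea, assuming trial division for factoring i and taking the modulus 10^9 + 7 from the question's code. The function names are made up, and the number of divisors is computed as the product of (exponent + 1) over all primes. It is meant to illustrate the bookkeeping, not to be fast enough for large n.

```python
from collections import Counter

MOD = 1_000_000_007  # modulus used in the question's code


def factorize(i):
    """Prime factorisation of i by trial division, as a Counter {prime: exponent}."""
    factors = Counter()
    d = 2
    while d * d <= i:
        while i % d == 0:
            factors[d] += 1
            i //= d
        d += 1
    if i > 1:
        factors[i] += 1  # leftover prime factor larger than the square root
    return factors


def divisor_count_of_F(n):
    """Number of divisors of F(n) modulo MOD, tracking only prime exponents."""
    prev, cur = Counter(), Counter()  # exponent maps for F(0) = 1 and F(1) = 1
    for i in range(2, n + 1):
        # F(i) = i * F(i-1) * F(i-2): exponents of equal primes are added.
        prev, cur = cur, factorize(i) + cur + prev
    result = 1
    for exponent in cur.values():
        result = result * (exponent + 1) % MOD
    return result


if __name__ == "__main__":
    for n in range(6):
        print(n, divisor_count_of_F(n))
```

For example, F(4) = 48 = 2^4 * 3 has (4+1)(1+1) = 10 divisors, which is what the sketch reports for n = 4.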
That is a fun problem.
The F(n) grow extremely fast. Since F(n) <= F(n+1) for all n, we have
F(n+2) > F(n)²
for all n, and thus
F(n) > 2^(2^(n/2-1))
for n > 2. That crude estimate already shows that one cannot store these numbers for any but the smallest n. By that estimate, F(100) requires more than 2^49 bits of storage, and 128 GB are only 2^40 bits. Actually, the prime factorisation of F(100) is
*Fiborial> fiborials !! 100
[(2,464855623252387472061),(3,184754360086075580988),(5,56806012190322167100)
,(7,20444417903078359662),(11,2894612619136622614),(13,1102203323977318975)
,(17,160545601976374531),(19,61312348893415199),(23,8944533909832252),(29,498454445374078)
,(31,190392553955142),(37,10610210054141),(41,1548008760101),(43,591286730489)
,(47,86267571285),(53,4807526976),(59,267914296),(61,102334155),(67,5702887),(71,832040)
,(73,317811),(79,17711),(83,2584),(89,144),(97,3)]
and that would require about 9.6 * 10^20 (roughly 2^70) bits - a little less than half of them are trailing zeros, but even storing the numbers à la floating point numbers with a significand and an exponent doesn't bring the required storage down far enough.
So instead of storing the numbers themselves, one can consider the prime factorisation. That also allows an easier computation of the number of divisors, since
$\mathrm{divisors}(n) = \prod_{i=1}^{k} (e_i + 1) \quad \text{if} \quad n = \prod_{i=1}^{k} p_i^{e_i}$
Now, let us investigate the prime factorisations of the F(n) a little. We begin with the
Lemma: A prime p divides F(n) if and only if p <= n.
That is easily proved by induction: F(0) = F(1) = 1 is not divisible by any prime, and there are no primes <= 1.
Now suppose that n > 1 and
A(k) = The prime factors of F(k) are exactly the primes <= k
holds for k < n. Then, since
F(n) = n * F(n-1) * F(n-2)
the set of prime factors of F(n) is the union of the sets of prime factors of n, F(n-1) and F(n-2).
By the induction hypothesis, the set of prime factors of F(k) is
P(k) = { p | 1 < p <= k, p prime }
for k < n. Now, if n is composite, all prime factors of n are smaller than n, hence the set of prime factors of F(n) is P(n-1), but since n is not prime, P(n) = P(n-1). If, on the other hand, n is prime, the set of prime factors of F(n) is
P(n-1) ∪ {n} = P(n)
With that, let us see how much work it is to track the prime factorisation of F(n) at once, and update the list/dictionary for each n (I ignore the problem of finding the factorisation of n, that doesn't take long for the small n involved).
The entry for the prime p appears first for n = p, and is then updated for each further n, altogether it is created/updated N - p + 1 times for F(N). Thus there are
$\sum_{p \le N} (N + 1 - p) = \pi(N) (N+1) - \sum_{p \le N} p \approx \frac{N^2}{2 \log N}$
updates in total. For N = 10^6, that is about 3.6 * 10^10 updates, way more than can be done in the allowed time (0.5 seconds).
So we need a different approach. Let us look at one prime p alone, and follow the exponent of p in the F(n).
Let v_p(k) be the exponent of p in the prime factorisation of k. Then we have
v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
and we know that v_p(F(k)) = 0 for k < p. So (assuming p is not too small to understand what goes on):
v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
v_p(F(p)) = 1 + 0 + 0 = 1
v_p(F(p+1)) = 0 + 1 + 0 = 1
v_p(F(p+2)) = 0 + 1 + 1 = 2
v_p(F(p+3)) = 0 + 2 + 1 = 3
v_p(F(p+4)) = 0 + 3 + 2 = 5
v_p(F(p+5)) = 0 + 5 + 3 = 8
So we get Fibonacci numbers for the exponents, v_p(F(p+k)) = Fib(k+1) - for a while, since later multiples of p inject further powers of p,
v_p(F(2*p-1)) = 0 + Fib(p-1) + Fib(p-2) = Fib(p)
v_p(F(2*p)) = 1 + Fib(p) + Fib(p-1) = 1 + Fib(p+1)
v_p(F(2*p+1)) = 0 + (1 + Fib(p+1)) + Fib(p) = 1 + Fib(p+2)
v_p(F(2*p+2)) = 0 + (1 + Fib(p+2)) + (1 + Fib(p+1)) = 2 + Fib(p+3)
v_p(F(2*p+3)) = 0 + (2 + Fib(p+3)) + (1 + Fib(p+2)) = 3 + Fib(p+4)
but the additional powers from 2*p also follow a nice Fibonacci pattern, and we have v_p(F(2*p+k)) = Fib(p+k+1) + Fib(k+1) for 0 <= k < p.
For further multiples of p, we get another Fibonacci summand in the exponent, so
$v_p(F(n)) = \sum_{k=1}^{\lfloor n/p \rfloor} \mathrm{Fib}(n + 1 - kp)$
This holds only as long as n < p², because multiples of p² contribute two to the exponent, so the corresponding summand would have to be multiplied by 2; for multiples of p³, by 3, and so on.
One can also split the contributions of multiples of higher powers of p, so one would get one Fibonacci summand due to it being a multiple of p, one for it being a multiple of p², one for being a multiple of p³ etc, that yields
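$v_p(F(n)) = \sum_{k=1}^{\lfloor n/p \rfloor} \mathrm{Fib}(n + 1 - kp) + \sum_{k=1}^{\lfloor n/p^2 \rfloor} \mathrm{Fib}(n + 1 - kp^2) + \sum_{k=1}^{\lfloor n/p^3 \rfloor} \mathrm{Fib}(n + 1 - kp^3) + \dots$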
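The split into one Fibonacci sum per power of p can be checked numerically against the recurrence v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2)). The following Python 3 sketch does that with exact integers; all function names are invented for the illustration, and Fib uses the indexing Fib(0) = 0, Fib(1) = 1.

```python
def fib_table(n):
    """Return [Fib(0), ..., Fib(n+1)] with Fib(0) = 0 and Fib(1) = 1."""
    fib = [0, 1]
    for _ in range(n):
        fib.append(fib[-1] + fib[-2])
    return fib


def valuation(k, p):
    """Exponent of the prime p in k (k >= 1)."""
    e = 0
    while k % p == 0:
        k //= p
        e += 1
    return e


def exponent_by_recurrence(n, p):
    """v_p(F(n)) computed directly from the recurrence."""
    prev, cur = 0, 0  # v_p(F(0)) and v_p(F(1))
    for i in range(2, n + 1):
        prev, cur = cur, valuation(i, p) + cur + prev
    return cur


def exponent_by_fib_sums(n, p):
    """v_p(F(n)) as one Fibonacci sum per power of p not exceeding n."""
    fib = fib_table(n + 1)
    total = 0
    q = p
    while q <= n:
        total += sum(fib[n + 1 - k * q] for k in range(1, n // q + 1))
        q *= p
    return total


if __name__ == "__main__":
    for p in (2, 3, 5, 97):
        print(p, exponent_by_recurrence(100, p), exponent_by_fib_sums(100, p))
```

For p = 97 and n = 100 both functions give Fib(4) = 3, matching the last entry of the factorisation of F(100) listed earlier.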
Now, in particular for the smaller primes, these sums have a lot of terms, and computing them that way would be slow. Fortunately, there is a closed formula for sums of Fibonacci numbers whose indices are an arithmetic progression, for 0 < a <= s
$\sum_{k=0}^{m} \mathrm{Fib}(a + ks) = \frac{\mathrm{Fib}(a + (m+1)s) - (-1)^s\, \mathrm{Fib}(a + ms) - (-1)^a\, \mathrm{Fib}(s - a) - \mathrm{Fib}(a)}{D(s)}$
where
D(s) = Luc(s) - 1 - (-1)^s
and Luc(k) is the k-th Lucas number, Luc(k) = Fib(k+1) + Fib(k-1).
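Here is a short Python 3 sketch that compares the closed formula with the term-by-term sum, using exact integers rather than arithmetic modulo 10^9 + 7. The function names are made up for this example; Fib(0) = 0 and Fib(1) = 1 are assumed as the indexing convention.

```python
def fib(k):
    """k-th Fibonacci number with Fib(0) = 0, Fib(1) = 1 (k >= 0)."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def lucas(k):
    """k-th Lucas number via Luc(k) = Fib(k+1) + Fib(k-1), for k >= 1."""
    return fib(k + 1) + fib(k - 1)


def progression_sum_direct(a, s, m):
    """Sum of Fib(a + k*s) for k = 0..m, term by term."""
    return sum(fib(a + k * s) for k in range(m + 1))


def progression_sum_closed(a, s, m):
    """Same sum via the closed formula, valid for 0 < a <= s."""
    d = lucas(s) - 1 - (-1) ** s
    numerator = (fib(a + (m + 1) * s)
                 - (-1) ** s * fib(a + m * s)
                 - (-1) ** a * fib(s - a)
                 - fib(a))
    quotient, remainder = divmod(numerator, d)
    assert remainder == 0  # the division is exact over the integers
    return quotient


if __name__ == "__main__":
    for a, s, m in [(1, 1, 5), (2, 3, 1), (3, 4, 1)]:
        print(a, s, m, progression_sum_direct(a, s, m), progression_sum_closed(a, s, m))
```

In a modular implementation the `divmod` step would be replaced by a multiplication with the inverse of D(s) modulo 10^9 + 7.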
For our purposes, we only need the Fibonacci numbers modulo 10^9 + 7, then the division must be replaced by a multiplication with the modular inverse of D(s).
Using these facts, the number of divisors of F(n) modulo 10^9+7 can be computed in the allowed time for n <= 10^6 (about 0.06 seconds on my old 32-bit box), although with Python, on the testing machines, further optimisations might be necessary.