Efficient Algorithm to Solve a Recursive Formula - performance

I am given a formula f(n) where f(n) is defined, for all non-negative integers, as:
f(0) = 1
f(1) = 1
f(2) = 2
f(2n) = f(n) + f(n + 1) + n (for n > 1)
f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
My goal is to find, for any given number s, the largest n where f(n) = s. If there is no such n return None. s can be up to 10^25.
I have a brute force solution using both recursion and dynamic programming, but neither is efficient enough. What concepts might help me find an efficient solution to this problem?

I want to add a little complexity analysis and estimate the size of f(n).
If you look at one recursive call of f(n), you notice that the input n is basically halved before f is called two more times, where one of those calls always gets an even and the other an odd input.
So the call tree is basically a binary tree where, at any depth k, about half of the nodes contribute a summand of roughly n/2^(k+1). The depth of the tree is log₂(n).
So the value of f(n) is in total about Θ((n/2)⋅log₂(n)).
Note: This holds for even and odd inputs, but for even inputs the value is larger by an additional summand of about n/2. (I use Θ-notation so I don't have to think too much about constants.)
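A quick way to check this size estimate numerically (my own sketch, not from the answer): memoise f and look at the ratio f(n) / ((n/2)⋅log₂(n)), which should stay within a constant band if the estimate is right.

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    # the recurrence from the question, memoised
    if n <= 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    return f(m) + f(m + 1) + m if n % 2 == 0 else f(m - 1) + f(m) + 1

for n in (10**3, 10**6, 10**9, 10**12):
    print(n, f(n), f(n) / ((n / 2) * math.log2(n)))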
Now to the complexity:
Naive brute force
To calculate f(n) you have to call f Θ(2^(log₂(n))) = Θ(n) times.
So if you want to calculate the values of f(n) until you reach s (or notice that there is no n with f(n) = s), you have to evaluate f about s⋅log₂(s) times, which is in total Θ(s²⋅log(s)).
Dynamic programming
If you store every result of f(n), the time to calculate a single f(n) reduces to Θ(1) (but it requires much more memory). So the total time complexity would reduce to Θ(s⋅log(s)).
Notice: Since we know f(n) ≤ f(n+2) for all n, the values are already in order (separately for even and odd n), so you don't have to sort the values of f(n) to do a binary search.
Using binary search
Algorithm (input is s):
1. Set l = 1 and r = s.
2. Set n = (l+r)/2 and round it to the nearest even number.
3. Calculate val = f(n).
4. If val == s, return n.
5. If val < s, set l = n; else set r = n.
6. Go to step 2.
If you found a solution, fine. If not: try it again, but this time round to odd numbers in step 2. If this also does not return a solution, no solution exists at all.
This will take you Θ(log(s)) for the binary search and Θ(s) for the calculation of f(n) each time, so in total you get Θ(s⋅log(s)).
As you can see, this has the same complexity as the dynamic programming solution, but you don't have to save anything.
Notice: r = s is not a valid initial upper limit for every s. However, if s is big enough, it is. To be safe, you can change the algorithm:
check first whether f(s) < s; if it is, you can set l = s and r = 2s (or 2s+1 if it has to be odd).
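To make the idea concrete, here is a minimal Python sketch of this binary search (my own code, not part of the answer: the names search and largest_n are made up, f is memoised with functools.lru_cache, and r = 2s + 1 is used as a safe upper limit because f(n) ≥ n/2 by a quick induction). It searches the even and the odd indices separately, exactly as described above.

from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    if n <= 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        return f(m) + f(m + 1) + m      # f(2m) = f(m) + f(m+1) + m   (m > 1)
    return f(m - 1) + f(m) + 1          # f(2m+1) = f(m-1) + f(m) + 1 (m >= 1)

def search(s, parity):
    # Binary search over n of one parity (0 = even, 1 = odd), using f(n) <= f(n+2).
    l, r = parity, 2 * s + 1
    best = None
    while l <= r:
        n = (l + r) // 2
        n -= (n - parity) % 2           # round down to the wanted parity
        if n < l:
            break
        v = f(n)
        if v == s:
            best = n                    # remember it, keep looking for a larger n
            l = n + 2
        elif v < s:
            l = n + 2
        else:
            r = n - 2
    return best

def largest_n(s):
    candidates = [x for x in (search(s, 0), search(s, 1)) if x is not None]
    return max(candidates) if candidates else None

print(largest_n(4))                     # 5, since f(5) = 4
print(largest_n(9992))                  # the test value used in an answer below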

Can you calculate the value of f(x), for every x from 0 to MAX_SIZE, just once?
What I mean is: calculate the values by DP.
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 3
f(4) = 7
f(5) = 4
... ...
f(MAX_SIZE) = ???
If this first step is not feasible, exit. Otherwise, sort the values from small to big.
Such as 1, 1, 2, 3, 4, 7, ...
Now you can find whether there exists an n with f(n) = s in O(log(MAX_SIZE)) time.
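A small sketch of that idea in Python (my own illustration: MAX_SIZE and the helper name largest_n are made up, and for the original question s can be up to 10^25, so a full table of this kind is not actually feasible there).

from bisect import bisect_left

MAX_SIZE = 10**5
f = [0] * (MAX_SIZE + 1)
f[0] = f[1] = 1
f[2] = 2
for n in range(3, MAX_SIZE + 1):
    m = n // 2
    f[n] = f[m] + f[m + 1] + m if n % 2 == 0 else f[m - 1] + f[m] + 1

pairs = sorted((v, i) for i, v in enumerate(f))     # sort the values, remember the indices

def largest_n(s):
    lo = bisect_left(pairs, (s, -1))
    best = None
    while lo < len(pairs) and pairs[lo][0] == s:
        best = pairs[lo][1]             # ties are sorted by index, so the last one is largest
        lo += 1
    return best

print(largest_n(4))                     # 5, since f(5) = 4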

Unfortunately, you don't mention how fast your algorithm should be. Perhaps you need to find some really clever rewrite of your formula to make it fast enough; in that case you might want to post this question on a mathematics forum.
The running time of your formula is O(n) for f(2n + 1) and O(n log n) for f(2n), according to the Master theorem, since:
T_even(n) = 2 * T(n / 2) + n / 2
T_odd(n) = 2 * T(n / 2) + 1
So the running time for the overall formula is O(n log n).
So if n is the answer to the problem, this algorithm would run in approx. O(n^2 log n), because you have to perform the formula roughly n times.
You can make this a little bit quicker by storing previous results, but of course, this is a tradeoff with memory.
Below is such a solution in Python.
D = {}

def f(n):
    if n in D:
        return D[n]
    if n == 0 or n == 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        # f(2n) = f(n) + f(n + 1) + n (for n > 1)
        y = f(m) + f(m + 1) + m
    else:
        # f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
        y = f(m - 1) + f(m) + 1
    D[n] = y
    return y

def find(s):
    n = 0
    y = 0
    even_sol = None
    while y < s:
        y = f(n)
        if y == s:
            even_sol = n
            break
        n += 2

    n = 1
    y = 0
    odd_sol = None
    while y < s:
        y = f(n)
        if y == s:
            odd_sol = n
            break
        n += 2

    print(s, even_sol, odd_sol)

find(9992)

This recursion produces increasing values for 2n and 2n+1 in every iteration, so the moment you get a value bigger than s, you can stop your algorithm.
To make an efficient algorithm you either have to find a nice closed formula that calculates the value directly, or compute the values in a small loop, which will be much, much more effective than your recursion. The plain recursion recomputes the same subproblems over and over, whereas the loop computes each value once and is O(n).
This is how the loop could look:
long[] values = new long[1000];               // use BigInteger instead of long if the values can approach 10^25
values[0] = 1;
values[1] = 1;
values[2] = 2;
values[3] = values[0] + values[1] + 1;        // f(3) = f(0) + f(1) + 1 (the odd rule with n = 1)
for (int i = 2; 2 * i + 1 < values.length; i++) {
    values[2 * i] = values[i] + values[i + 1] + i;          // f(2n) = f(n) + f(n+1) + n   (n > 1)
    values[2 * i + 1] = values[i - 1] + values[i] + 1;      // f(2n+1) = f(n-1) + f(n) + 1 (n >= 1)
}
And inside this loop, add a condition that breaks out of it early with success or failure.

Related

Algorithm Analysis: Expected Running Time of Recursive Function Based on a RNG

I am somewhat confused with the running time analysis of a program here which has recursive calls that depend on an RNG (randomly generated number).
Let's begin with the pseudo-code, and then I will go into what I have thought about so far related to this one.
Func1(A, i, j)
    /* A is an array of at least j integers */
    if (i ≥ j) then return (0);
    n ← j − i + 1;                     /* n = number of elements from i to j */
    k ← Random(n);
    s ← 0;                             // takes arbitrary constant time c
    for r ← i to j do
        A[r] ← A[r] − A[i] − A[j];     // arbitrary constant time c
        s ← s + A[r];                  // arbitrary constant time c
    end
    s ← s + Func1(A, i, i+k-1);        // recursive call 1
    s ← s + Func1(A, i+k, j);          // recursive call 2
    return (s);
Okay, now let's get into the math I have tried so far. I'll try not to be too pedantic here as it is just a rough, estimated analysis of expected run time.
First, let's consider the worst case. Note that K = Random(n) must be at least 1 and at most n. Therefore, the worst case is when K = 1 is picked. This causes the total running time to be T(n) = cn + T(1) + T(n-1), which means that overall it takes somewhere around cn^2 time total (you can use Wolfram to solve recurrence relations if you are stuck or rusty on them, although this one is fairly simple).
Now, here is where I get somewhat confused. For the expected running time, we have to base our analysis on the probability of the random number K. Therefore, we have to sum all the possible running times for the different values of k, each weighted by its individual probability. By lemma/hopefully intuitive logic: the probability of any one randomly generated k, with k between 1 and n, is equal to 1/n.
Therefore, (in my opinion/analysis) the expected run time is:
ET(n) = cn + (1/n)*Summation(from k=1 to n-1) of (ET(k-1) + ET(n-k))
Let me explain a bit. The cn is simply for the loop which runs i to j. This is estimated by cn. The summation represents all of the possible values for k. The (1/n) multiplied by this summation is there because the probability of any one k is (1/n). The terms inside the summation represent the running times of the recursive calls of Func1. The first term on the left takes ET(k-1) because this recursive call is going to do a loop from i to k-1 (which is roughly ck), and then possibly call Func1 again. The second is a representation of the second recursive call, which would loop from i+k to j, which is also represented by n-k.
Upon expansion of the summation, we see that the overall function ET(n) is of the order n^2. However, as a test case, plugging in k=(n/2) gives a total running time for Func 1 of roughly nlog(n). This is why I am confused. How can this be, if the estimated running time is of the order n^2? Am I considering a "good" case by plugging in n/2 for k? Or am I thinking about k in the wrong sense in some way?
Expected time complexity is ET(n) = O(nlogn). Following is a proof I derived myself; please tell me if there is any error:
ET(n) = P(k=1)*(ET(1)+ET(n-1)) + P(k=2)*(ET(2)+ET(n-2)).......P(k=n-1)*(ET(n-1)+ET(1)) + c*n
As the RNG is uniformly random P(k=x) = 1/n for all x
hence ET(n) = 1/n*(ET(1)*2+ET(2)*2....ET(n-1)*2) + c*n
ET(n) = 2/n*sum(ET(i)) + c*n i in (1,n-1)
ET(n-1) = 2/(n-1)*sum(ET(i)) + c*(n-1) i in (1,n-2)
sum(ET(i)) i in (1,n-2) = (ET(n-1)-c*(n-1))*(n-1)/2
ET(n) = 2/n*(sum(ET(i)) in (1,n-2) + ET(n-1)) + c*n
ET(n) = 2/n*((ET(n-1)-c*(n-1))*(n-1)/2+ET(n-1)) + c*n
ET(n) = 2/n*((n+1)/2*ET(n-1) - c*(n-1)*(n-1)/2) + c*n
ET(n) = (n+1)/n*ET(n-1) + c*n - c*(n-1)*(n-1)/n
ET(n) = (n+1)/n*ET(n-1) + c*(2n-1)/n <= (n+1)/n*ET(n-1) + 2c
solving the recurrence (divide by n+1, so ET(n)/(n+1) <= ET(n-1)/n + 2c/(n+1)):
ET(n)/(n+1) <= ET(1)/2 + 2c*(1/3 + 1/4 + ... + 1/(n+1))
ET(n) <= (n+1)/2*ET(1) + 2c*(n+1)*sum(1/i) i in (1,n+1)
sum(1/i) i in (1,n) = O(logn)
ET(n) = O(nlogn)
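Not a proof, but a quick Monte Carlo check of this O(nlogn) conclusion (my own sketch): count the loop iterations performed by Func1 and compare the average with n*log(n).

import math
import random

def func1_ops(n):
    # Count the for-loop iterations of Func1(A, 1, n), using an explicit stack
    # instead of recursion; the array contents do not matter for the count.
    ops = 0
    stack = [(1, n)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        m = j - i + 1
        k = random.randint(1, m)        # Random(n), uniform in {1, ..., n}
        ops += m                        # the loop from i to j
        stack.append((i, i + k - 1))    # recursive call 1
        stack.append((i + k, j))        # recursive call 2
    return ops

for n in (10**3, 10**4, 10**5):
    avg = sum(func1_ops(n) for _ in range(100)) / 100
    print(n, avg, avg / (n * math.log(n)))   # the last ratio should stay roughly constant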

complexity of a randomized search algorithm

Consider the following randomized search algorithm on a sorted array a of length n (in increasing order). x can be any element of the array.
size_t randomized_search(value_t a[], size_t n, value_t x)
{
    size_t l = 0;
    size_t r = n - 1;
    while (true) {
        size_t j = rand_between(l, r);
        if (a[j] == x) return j;
        if (a[j] < x) l = j + 1;
        if (a[j] > x) r = j - 1;
    }
}
What is the expected value of the Big Theta complexity (bounded both below and above) of this function when x is selected uniformly at random from a?
Although this seems to be log(n), I carried out an experiment with instruction counts, and found out that the result grows a little faster than log(n) (according to my data, even (log(n))^1.1 fits the result better).
Someone told me that this algorithm has an exact big theta complexity (so obviously log(n)^1.1 is not the answer). So, could you please give the time complexity along with your approach to prove it? Thanks.
Update: the data from my experiment (the Mathematica plots of the log(n) fit and the log(n)^1.1 fit are not reproduced here).
If you're willing to switch to counting three-way compares, I can tell you the exact complexity.
Suppose that the key is at position i, and I want to know the expected number of compares with position j. I claim that position j is examined if and only if it's the first position between i and j inclusive to be examined. Since the pivot element is selected uniformly at random each time, this happens with probability 1/(|i - j| + 1).
The total complexity is the expectation over i <- {1, ..., n} of sum_{j=1}^n 1/(|i - j| + 1), which is
sum_{i=1}^n 1/n sum_{j=1}^n 1/(|i - j| + 1)
= 1/n sum_{i=1}^n (sum_{j=1}^i 1/(i - j + 1) + sum_{j=i+1}^n 1/(j - i + 1))
= 1/n sum_{i=1}^n (H(i) + H(n + 1 - i) - 1)
= 1/n sum_{i=1}^n H(i) + 1/n sum_{i=1}^n H(n + 1 - i) - 1
= 1/n sum_{i=1}^n H(i) + 1/n sum_{k=1}^n H(k) - 1 (k = n + 1 - i)
= 2 H(n + 1) - 3 + 2 H(n + 1)/n - 2/n
= 2 H(n + 1) - 3 + O(log n / n)
= 2 log n + O(1)
= Theta(log n).
(log means natural log here.) Note the -3 in the low order terms. This makes it look like the number of compares is growing faster than logarithmic at the beginning, but the asymptotic behavior dictates that it levels off. Try excluding small n and refitting your curves.
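A small simulation matching this count (my own sketch): it charges one three-way compare per loop iteration and compares the average against 2 H(n+1) - 3, ignoring the O(log n / n) term.

import random

def compares(n):
    a = list(range(n))              # sorted array
    x = random.randrange(n)         # x drawn uniformly from a
    l, r = 0, n - 1
    c = 0
    while True:
        j = random.randint(l, r)    # rand_between(l, r), inclusive
        c += 1                      # one three-way compare of a[j] against x
        if a[j] == x:
            return c
        if a[j] < x:
            l = j + 1
        else:
            r = j - 1

def H(m):
    return sum(1.0 / k for k in range(1, m + 1))

for n in (100, 1000, 10000):
    avg = sum(compares(n) for _ in range(5000)) / 5000
    print(n, avg, 2 * H(n + 1) - 3)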
Assuming rand_between to implement sampling from a uniform probability distribution in constant time, the expected running time of this algorithm is Θ(lg n). Informal sketch of a proof: the expected value of rand_between(l, r) is (l+r)/2, the midpoint between them. So each iteration is expected to skip half of the array (assuming the size is a power of two), just like a single iteration of binary search would.
More formally, borrowing from an analysis of quickselect, observe that when you pick a random midpoint, half of the time it will be between ¼n and ¾n. Neither the left nor the right subarray has more than ¾n elements. The other half of the time, neither has more than n elements (obviously). That leads to a recurrence relation
T(n) = ½T(¾n) + ½T(n) + f(n)
where f(n) is the amount of work in each iteration. Subtracting ½T(n) from both sides, then doubling both sides, we have
½T(n) = ½T(¾n) + f(n)
T(n) = T(¾n) + 2f(n)
Now, since 2f(n) = Θ(1) = Θ(n^c log^0 n) where c = log_{4/3}(1) = 0, it follows by the master theorem that T(n) = Θ(n^0 lg n) = Θ(lg n).

Time complexity of the program using recurrence equation

I want to find out the time complexity of the program using recurrence equations.
That is ..
int f(int x)
{
    if (x < 1) return 1;
    else return f(x - 1) + g(x);
}

int g(int x)
{
    if (x < 2) return 1;
    else return f(x - 1) + g(x / 2);
}
I wrote its recurrence equation and tried to solve it, but it keeps getting more complex:
T(n) = T(n-1) + g(n) + c
     = T(n-2) + g(n-1) + g(n) + c + c
     = T(n-3) + g(n-2) + g(n-1) + g(n) + c + c + c
     = T(n-4) + g(n-3) + g(n-2) + g(n-1) + g(n) + c + c + c + c
...
At the kth step:
     = kc + g(n) + g(n-1) + g(n-2) + ... + g(n-k+1) + T(n-k)
Let the input become 1 at the kth step, i.e. n - k = 1, so k = n - 1.
Now I end up with this:
T(n) = (n-1)c + g(n) + g(n-1) + g(n-2) + ... + g(2) + T(1)
I'm not able to solve it further.
Anyway, if we count the number of function calls in this program, it can easily be seen that the time complexity is exponential, but I want to prove it using the recurrence. How can it be done?
The explanation in Answer 1 looks correct; I did similar work.
The most difficult task in this code is to write its recurrence equation. I have drawn another diagram and identified some patterns; I think this diagram can help with finding the possible recurrence equation.
And I came up with this equation, not sure if it is right. Please help.
T(n) = 2*T(n-1) + c * logn
Ok, I think I have been able to prove that f(x) = Theta(2^x) (note that the time complexity is the same). This also proves that g(x) = Theta(2^x) as f(x) > g(x) > f(x-1).
First as everyone noted, it is easy to prove that f(x) = Omega(2^x).
Now we have the relation that f(x) <= 2 f(x-1) + f(x/2) (since f(x) > g(x))
We will show that, for sufficiently large x, there is some constant K > 0 such that
f(x) <= K*H(x), where H(x) = (2 + 1/x)^x
This implies that f(x) = Theta(2^x), as H(x) = Theta(2^x), which itself follows from the fact that H(x)/2^x -> sqrt(e) as x-> infinity (wolfram alpha link of the limit).
Now (warning: heavier math, perhaps cs.stackexchange or math.stackexchange is better suited)
according to wolfram alpha (click the link and see series expansion near x = infinity),
H(x) = exp(x ln(2) + 1/2 + O(1/x))
And again, according to wolfram alpha (click the link (different from above) and see the series expansion for x = infinity), we have that
H(x) - 2H(x-1) = [1/2x + O(1/x^2)]exp(x ln(2) + 1/2 + O(1/x))
and so
[H(x) - 2H(x-1)]/H(x/2) -> infinity as x -> infinity
Thus, for sufficiently large x (say x > L) we have the inequality
H(x) >= 2H(x-1) + H(x/2)
Now there is some K (dependent only on L (for instance K = f(2L))) such that
f(x) <= K*H(x) for all x <= 2L
Now we proceed by (strong) induction (you can revert to natural numbers if you want to)
f(x+1) <= 2f(x) + f((x+1)/2)
By induction, the right side is
<= 2*K*H(x) + K*H((x+1)/2)
And we proved earlier that
2*H(x) + H((x+1)/2) <= H(x+1)
Thus f(x+1) <= K * H(x+1)
Using memoisation, both functions can easily be computed in O(n) time. But the program takes at least O(2^n) time, and thus is a very inefficient way of computing f(n) and g(n)
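A quick sketch of both statements (my own illustration, in Python): the memoised values f(n) and g(n) come out in O(n) time, while the call counts F(n) and G(n) of the unmemoised program, computed from the call-count recurrences given below, grow roughly like 2^n.

from functools import lru_cache

@lru_cache(maxsize=None)
def f(x):                  # value of f, memoised
    return 1 if x < 1 else f(x - 1) + g(x)

@lru_cache(maxsize=None)
def g(x):                  # value of g, memoised
    return 1 if x < 2 else f(x - 1) + g(x // 2)

@lru_cache(maxsize=None)
def F(n):                  # number of calls the unmemoised f(n) makes
    return 1 if n < 1 else F(n - 1) + G(n) + 1

@lru_cache(maxsize=None)
def G(n):                  # number of calls the unmemoised g(n) makes
    return 1 if n < 2 else F(n - 1) + G(n // 2) + 1

for n in (10, 20, 30, 40):
    print(n, f(n), F(n), F(n) / 2**n)    # the last ratio grows very slowly (the discussion here argues it stays bounded)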
To prove that the program takes at most O(2+epsilon)^n time for any epsilon > 0:
Let F(n) and G(n) be the number of function calls that are made in evaluating f(n) and g(n), respectively. Clearly (counting the addition as 1 function call):
F(0) = 1; F(n) = F(n-1) + G(n) + 1
G(1) = 1; G(n) = F(n-1) + G(n/2) + 1
Then one can prove:
F and G are monotonic
F > G
Define H(1) = 2; H(n) = 2 * H(n-1) + H(n/2) + 1
clearly, H > F
for all n, H(n) > 2 * H(n-1)
hence H(n/2) / H(n-1) -> 0 for sufficiently large n
hence H(n) < (2 + epsilon) * H(n-1) for all epsilon > 0 and sufficiently large n
hence H in O((2 + epsilon)^n) for any epsilon > 0
(Edit: originally I concluded here that the upper bound is O(2^n). That is incorrect, as nhahtdh pointed out, but see below)
so this is the best I can prove.... Because G < F < H they are also in O((2 + epsilon)^n) for any epsilon > 0
Postscript (after seeing Mr Knoothes solution): Because i.m.h.o a good mathematical proof gives insight, rather than lots of formulas, and SO exists for all those future generations (hi gals!):
For many algorithms, calculating f(n+1) involves twice (thrice,..) the amount of work for f(n), plus something more. If this something more becomes relatively less with increasing n (which is often the case) using a fixed epsilon like above is not optimal.
Replacing the epsilon above by some decreasing function ε(n) of n will in many cases (if ε decreases fast enough, say ε(n)=1/n) yield an upper bound O((2 + ε(n))^n ) = O(2^n)
Let f(0)=0 and g(0)=0
From the function we have,
f(x) = f(x - 1) + g(x)
g(x) = f(x - 1) + g(x/2)
Substituting g(x) in f(x) we get,
f(x) = f(x-1) + f(x -1) + g(x/2)
∴f(x) = 2f(x-1) + g(x/2)
Expanding this we get,
f(x) = 2f(x-1)+f(x/2-1)+f(x/4-1)+ ... + f(1)
Let s(x) be a function defined as follows,
s(x) = 2s(x-1)
Now clearly f(x)=Ω(s(x)).
The complexity of s(x) is O(2^x).
Therefore function f(x) = Ω(2^x).
I think it is clear to see that f(n) > 2^n, because f(n) > h(n) = 2h(n-1) = 2^n.
Now I claim that for every n, there is an ε such that:
f(n) < (2+ε)^n. To see this, let's do it by induction, but to make it more sensible I'll first use ε = 1 to show f(n) <= 3^n, and then I'll extend it.
We will use strong induction. Suppose for every m < n, f(m) < 3^m; then we have:
f(n) = 2f(n-1) + [f(n/2 -1) + f(n/4 -1) + ... + f(1-1)]
but for this part:
A = f(n/2 -1) + f(n/4 -1) + ... + f(1-1)
we have:
f(n/2) = 2f(n/2 -1) + [f(n/4 -1) + ... + f(1-1)] ==>
A <= f(n/2) [1]
So we can rewrite f(n):
f(n) = 2f(n-1) + A < 2f(n-1) + f(n/2).
Now let's get back to our claim:
f(n) < 2*3^(n-1) + 3^(n/2) ==>
f(n) < 2*3^(n-1) + 3^(n-1) ==>
f(n) < 3^n. [2]
By [2], the proof of f(n) ∈ O(3^n) is complete.
But if you want to extend this to the form (2+ε)^n, just use [1] to replace the inequality; then we will have
for ε ≥ 1/(2+ε)^(n/2-1) → f(n) < (2+ε)^n. [3]
Also, by [3], you can say that for every n there is an ε such that f(n) < (2+ε)^n; actually, there is a constant ε such that for n > n0, f(n) ∈ O((2+ε)^n). [4]
Now we can use Wolfram Alpha like @Knoothe, by setting ε = 1/n; then we will have:
f(n) < (2+1/n)^n, which results in f(n) < e·2^n, and by our simple lower bound from the start we have f(n) ∈ Θ(2^n). [5]
P.S.: I didn't calculate epsilon exactly, but you can do it simply with pen and paper. I think this epsilon is not correct, but it is easy to find; if it turns out to be hard, tell me and I'll write it out.

Any faster algorithm to compute the number of divisors

The F series is defined as
F(0) = 1
F(1) = 1
F(i) = i * F(i - 1) * F(i - 2) for i > 1
The task is to find the number of different divisors for F(i)
This question is from Timus. I tried the following Python, but it surely gives a time limit exceeded. This brute-force approach will not work for a large input, since it will cause integer overflow as well.
#!/usr/bin/env python
from math import sqrt

n = int(raw_input())

def f(n):
    global arr
    if n == 0:
        return 1
    if n == 1:
        return 1
    a = 1
    b = 1
    for i in xrange(2, n + 1):
        k = i * a * b
        a = b
        b = k
    return b

x = f(n)
cnt = 0
for i in xrange(1, int(sqrt(x)) + 1):
    if x % i == 0:
        if x / i == i:
            cnt += 1
        else:
            cnt += 2
print cnt
Any optimization?
EDIT
I have tried the suggestion and rewrote the solution (not storing the value of F(n) directly, but a list of its prime factors):
#!/usr/bin/env python
#from math import sqrt

T = 10000
primes = range(T)
primes[0] = False
primes[1] = False
primes[2] = True
primes[3] = True
for i in xrange(T):
    if primes[i]:
        j = i + i
        while j < T:
            primes[j] = False
            j += i

p = []
for i in xrange(T):
    if primes[i]:
        p.append(i)

n = int(raw_input())

def f(n):
    global p
    if n == 1:
        return 1
    a = dict()
    b = dict()
    for i in xrange(2, n + 1):
        c = a.copy()
        for y in b.iterkeys():
            if c.has_key(y):
                c[y] += b[y]
            else:
                c[y] = b[y]
        k = i
        for y in p:
            d = 0
            if k % y == 0:
                while k % y == 0:
                    k /= y
                    d += 1
                if c.has_key(y):
                    c[y] += d
                else:
                    c[y] = d
            if k < y: break
        a = b
        b = c
    k = 1
    for i in b.iterkeys():
        k = k * (b[i] + 1) % (1000000007)
    return k

print f(n)
And it still gives TL5 (time limit exceeded); it is not fast enough, but this solves the problem of overflow for the value F(n).
First see this wikipedia article on the divisor function. In short, if you have a number and you know its prime factors, you can easily calculate the number of divisors (get SO to do TeX math):
$n = \prod_{i=1}^r p_i^{a_i}$
$\sigma_x(n) = \prod_{i=1}^{r} \frac{p_{i}^{(a_{i}+1)x}-1}{p_{i}^x-1}$
Anyway, it's a simple function. For counting divisors you need the x = 0 case, which simplifies to the product of the (a_i + 1) factors.
Now, to solve your problem, instead of keeping F(n) as the number itself, keep it as a set of prime factors and exponent sizes. Then the function that calculates F(n) simply takes the two sets for F(n-1) and F(n-2), sums the exponents of the same prime factors in both sets (assuming zero for nonexistent ones) and additionally adds the set of prime factors and exponent sizes for the number i. This means that you need another simple¹ function to find the prime factors of i.
Computing F(n) this way, you just need to apply the above formula (taken from Wikipedia) to the set and there's your value. Note also that F(n) can quickly get very large. This solution also avoids usage of big-num libraries (since no prime factor nor its exponent is likely to go beyond 4 billion²).
¹ Of course this is not so simple for arbitrarily large i, otherwise we wouldn't have any form of security right now, but for your application it should be simple enough.
² Well it might. If you happen to figure out a simple formula answering your question given any n, then large ns would also be possible in the test case, for which this algorithm is likely going to give a time limit exceeded.
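For illustration, here is a small sketch of that bookkeeping (my own code: collections.Counter holds the exponent maps and factorise() is naive trial division, which is fine for the small i involved). As the next answer shows, this straightforward per-i merging is still too slow for n up to 10^6 under the time limit, but it avoids big numbers entirely.

from collections import Counter

def factorise(i):
    # naive trial division, good enough for small i
    c = Counter()
    d = 2
    while d * d <= i:
        while i % d == 0:
            c[d] += 1
            i //= d
        d += 1
    if i > 1:
        c[i] += 1
    return c

def divisor_count_of_F(n, mod=10**9 + 7):
    prev, cur = Counter(), Counter()       # exponent maps of F(0) = 1 and F(1) = 1
    for i in range(2, n + 1):
        nxt = prev + cur + factorise(i)    # F(i) = i * F(i-1) * F(i-2)
        prev, cur = cur, nxt
    ans = 1
    for e in cur.values():
        ans = ans * (e + 1) % mod          # number of divisors = product of (e_i + 1)
    return ans

print(divisor_count_of_F(5))               # F(5) = 1440 = 2^5 * 3^2 * 5, so this prints 36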
That is a fun problem.
The F(n) grow extremely fast. Since F(n) <= F(n+1) for all n, we have
F(n+2) > F(n)²
for all n, and thus
F(n) > 2^(2^(n/2-1))
for n > 2. That crude estimate already shows that one cannot store these numbers for any but the smallest n. By that estimate, F(100) requires more than 2^49 bits of storage, while 128 GB are only about 2^40 bits. Actually, the prime factorisation of F(100) is
*Fiborial> fiborials !! 100
[(2,464855623252387472061),(3,184754360086075580988),(5,56806012190322167100)
,(7,20444417903078359662),(11,2894612619136622614),(13,1102203323977318975)
,(17,160545601976374531),(19,61312348893415199),(23,8944533909832252),(29,498454445374078)
,(31,190392553955142),(37,10610210054141),(41,1548008760101),(43,591286730489)
,(47,86267571285),(53,4807526976),(59,267914296),(61,102334155),(67,5702887),(71,832040)
,(73,317811),(79,17711),(83,2584),(89,144),(97,3)]
and that would require about 9.6 * 10^20 (roughly 2^70) bits - a little less than half of them are trailing zeros, but even storing the numbers à la floating point numbers with a significand and an exponent doesn't bring the required storage down far enough.
So instead of storing the numbers themselves, one can consider the prime factorisation. That also allows an easier computation of the number of divisors, since
divisors(n) = ∏_{i=1}^{k} (e_i + 1)    if    n = ∏_{i=1}^{k} p_i^e_i
Now, let us investigate the prime factorisations of the F(n) a little. We begin with the
Lemma: A prime p divides F(n) if and only if p <= n.
That is easily proved by induction: F(0) = F(1) = 1 is not divisible by any prime, and there are no primes <= 1.
Now suppose that n > 1 and
A(k) = The prime factors of F(k) are exactly the primes <= k
holds for k < n. Then, since
F(n) = n * F(n-1) * F(n-2)
the set of prime factors of F(n) is the union of the sets of prime factors of n, F(n-1) and F(n-2).
By the induction hypothesis, the set of prime factors of F(k) is
P(k) = { p | 1 < p <= k, p prime }
for k < n. Now, if n is composite, all prime factors of n are smaller than n, hence the set of prime factors of F(n) is P(n-1), but since n is not prime, P(n) = P(n-1). If, on the other hand, n is prime, the set of prime factors of F(n) is
P(n-1) ∪ {n} = P(n)
With that, let us see how much work it is to track the prime factorisation of F(n) at once, and update the list/dictionary for each n (I ignore the problem of finding the factorisation of n, that doesn't take long for the small n involved).
The entry for the prime p appears first for n = p, and is then updated for each further n, altogether it is created/updated N - p + 1 times for F(N). Thus there are
∑_{p ≤ N} (N + 1 - p) = π(N)*(N+1) - ∑_{p ≤ N} p ≈ N²/(2*log N)
updates in total. For N = 10^6, about 3.6 * 10^10 updates, that is way more than can be done in the allowed time (0.5 seconds).
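If you want to check that count concretely, a short sieve does it (my own sketch; it computes the exact number of updates ∑_{p ≤ N} (N + 1 - p) and compares it with the N²/(2*log N) estimate; both come out at a few times 10^10).

import math

N = 10**6
sieve = bytearray([1]) * (N + 1)
sieve[0] = sieve[1] = 0
for i in range(2, int(N**0.5) + 1):
    if sieve[i]:
        sieve[i*i::i] = bytearray(len(range(i*i, N + 1, i)))
primes = [i for i in range(2, N + 1) if sieve[i]]

updates = sum(N + 1 - p for p in primes)       # exact number of per-prime updates
estimate = N**2 / (2 * math.log(N))
print(updates, estimate)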
So we need a different approach. Let us look at one prime p alone, and follow the exponent of p in the F(n).
Let v_p(k) be the exponent of p in the prime factorisation of k. Then we have
v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
and we know that v_p(F(k)) = 0 for k < p. So (assuming p is not too small, so that it is easy to see what goes on):
v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
v_p(F(p)) = 1 + 0 + 0 = 1
v_p(F(p+1)) = 0 + 1 + 0 = 1
v_p(F(p+2)) = 0 + 1 + 1 = 2
v_p(F(p+3)) = 0 + 2 + 1 = 3
v_p(F(p+4)) = 0 + 3 + 2 = 5
v_p(F(p+5)) = 0 + 5 + 3 = 8
So we get Fibonacci numbers for the exponents, v_p(F(p+k)) = Fib(k+1) - for a while, since later multiples of p inject further powers of p,
v_p(F(2*p-1)) = 0 + Fib(p-1) + Fib(p-2) = Fib(p)
v_p(F(2*p)) = 1 + Fib(p) + Fib(p-1) = 1 + Fib(p+1)
v_p(F(2*p+1)) = 0 + (1 + Fib(p+1)) + Fib(p) = 1 + Fib(p+2)
v_p(F(2*p+2)) = 0 + (1 + Fib(p+2)) + (1 + Fib(p+1)) = 2 + Fib(p+3)
v_p(F(2*p+3)) = 0 + (2 + Fib(p+3)) + (1 + Fib(p+2)) = 3 + Fib(p+4)
but the additional powers from 2*p also follow a nice Fibonacci pattern, and we have v_p(F(2*p+k)) = Fib(p+k+1) + Fib(k+1) for 0 <= k < p.
For further multiples of p, we get another Fibonacci summand in the exponent, so
v_p(F(n)) = ∑_{k=1}^{n/p} Fib(n + 1 - k*p)
-- until n >= p², because multiples of p² contribute two to the exponent, and the corresponding summand would have to be multiplied by 2; for multiples of p³, by 3 etc.
One can also split the contributions of multiples of higher powers of p, so one would get one Fibonacci summand due to it being a multiple of p, one for it being a multiple of p², one for being a multiple of p³ etc, that yields
v_p(F(n)) = ∑_{k=1}^{n/p} Fib(n + 1 - k*p) + ∑_{k=1}^{n/p²} Fib(n + 1 - k*p²) + ∑_{k=1}^{n/p³} Fib(n + 1 - k*p³) + ...
Now, in particular for the smaller primes, these sums have a lot of terms, and computing them that way would be slow. Fortunately, there is a closed formula for sums of Fibonacci numbers whose indices are an arithmetic progression, for 0 < a <= s
∑_{k=0}^{m} Fib(a + k*s) = (Fib(a + (m+1)*s) - (-1)^s * Fib(a + m*s) - (-1)^a * Fib(s - a) - Fib(a)) / D(s)
where
D(s) = Luc(s) - 1 - (-1)^s
and Luc(k) is the k-th Lucas number, Luc(k) = Fib(k+1) + Fib(k-1).
For our purposes, we only need the Fibonacci numbers modulo 10^9 + 7, then the division must be replaced by a multiplication with the modular inverse of D(s).
Using these facts, the number of divisors of F(n) modulo 10^9+7 can be computed in the allowed time for n <= 10^6 (about 0.06 seconds on my old 32-bit box), although with Python, on the testing machines, further optimisations might be necessary.
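As a sanity check of the exponent formula, here is a small sketch (my own code) that compares v_p(F(n)) computed directly from v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2)) with the sum of Fibonacci numbers over the multiples of p, p², p³, ... given above (without the closed-form/modular-inverse speedup).

def v(p, k):
    # exponent of p in k
    e = 0
    while k % p == 0:
        k //= p
        e += 1
    return e

def fib(k):
    # Fib(1) = Fib(2) = 1
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def vp_direct(p, n):
    # v_p(F(n)) via the recurrence v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
    e0, e1 = 0, 0                      # exponents in F(0) and F(1)
    for i in range(2, n + 1):
        e0, e1 = e1, v(p, i) + e1 + e0
    return e1

def vp_formula(p, n):
    total = 0
    q = p
    while q <= n:
        total += sum(fib(n + 1 - k * q) for k in range(1, n // q + 1))
        q *= p
    return total

for p, n in [(2, 30), (3, 25), (5, 40), (7, 20)]:
    print(p, n, vp_direct(p, n), vp_formula(p, n))   # the two columns should agree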

Running Time of Divide And conquer fibonacci program

count = 0

def fibonacci(n):
    global count
    count = count + 1
    if not isinstance(n, int):
        print('Invalid Input')
        return None
    if n < 0:
        print('Invalid Input')
        return None
    if n == 0:
        return 0
    if n == 1:
        return 1
    fib = fibonacci(n-1) + fibonacci(n-2)
    return fib

fibonacci(8)
print(count)
I was trying to find out the running time of this Fibonacci program. Can anyone help me solve the recurrence relation for it?
T(n) = T(n-1) + T(n-2)... What would be the running time calculation from here?
Thanks... :)
I am assuming you meant 'fibonacci' where you said 'factorial'.
At each level, you have two calls to fibonacci(). This means your running time will be O(2^n). You can see this by drawing the recursion tree.
For a much better and more detailed explanation, please see Computational complexity of Fibonacci Sequence.
You can see the wiki, but here is a simple observation, as you wrote:
T(n) < 2T(n-1) = 2 * 2 T(n-2) = ... = 2^(n-1) T(1) = 2^(n-1). So T(n) is in O(2^n).
In fact you should solve x^2 = x + 1, so x will be phi1 = (1+sqrt(5))/2 or phi2 = (1-sqrt(5))/2; the result is of the form a * phi1^n + b * phi2^n, but because |phi2| is smaller than 1, for big n we can say T(n) = Θ(phi1^n).
Edit: But you can change your current solution to take O(n) running time (with a loop starting from the first elements).
Take a look at this, especially time.clock(). Call clock() before your function call and after, calculate the difference, and you've got the elapsed time.
Btw: Why so much code for fibonacci?
def fib (n): return fib (n - 1) + fib (n - 2) if n > 1 else n
The runtime is 2F(n+1) - 1 calls, where F(n) is the nth Fibonacci number.
Here's a quick inductive proof:
As a base case, if n = 0 or n = 1, then we make exactly one call, and F(1) = F(2) = 1, and we have that 2F(n+1) - 1 = 1.
For the inductive step, if n > 1, then we make as many calls as are necessary to evaluate the function on n-1 and n-2. By the inductive hypothesis, this takes 2F(n) - 1 + 2F(n-1) - 1 = 2F(n+1) - 2 recursive calls to complete. However, because we count the current function call as well, we add one to this to get 2F(n+1) - 1 as required.
Note that 2F(n+1) - 1 is an expression for the nth Leonardo number, where
L(0) = L(1) = 1
L(n+2) = L(n) + L(n+1) + 1
which grows as Θ(φ^n), as Saeed points out. However, this answer is mathematically exact.
This is more accurately the runtime you're interested in, since you need to account for the work being done in each recursive call itself. If you leave off the +1 term, you just get back the Fibonacci series!
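A quick empirical check of this formula (my own sketch), counting the calls exactly the way the question's program does:

def fib_calls(n):
    # returns (fib(n), number of calls made), mirroring the question's program
    if n < 2:
        return n, 1
    a, ca = fib_calls(n - 1)
    b, cb = fib_calls(n - 2)
    return a + b, ca + cb + 1

def F(k):
    # ordinary Fibonacci numbers, F(1) = F(2) = 1
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

for n in range(15):
    _, calls = fib_calls(n)
    print(n, calls, 2 * F(n + 1) - 1)   # the two counts should match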
