Here are two algorithms (pseudo-code):
Alg1(n)
1. int x = n
2. int a = 0
3. while(x > 1) do
3.1. for i = 1 to x do
3.1.1 a = a + 1
3.2 x = x - n/5
Alg2(n)
1. int x = n
2. int a = 0
3. while(x > 1) do
3.1. for i = 1 to x do
3.1.1 a = a + 1
3.2 x = x/5
The difference is on line 3.2.
Time complexity:
Alg1: c + c +n*n*2c = 2c + 2cn² = 2c(n² + 1) = O(n²)
Alg2: c + c +n*n*2c = 2c + 2cn² = 2c(n² + 1) = O(n²)
I wanted to know if the calculation is correct.
Thanks!
No, I'm afraid you aren't correct.
In the first algorithm, the line:
x = x - n/5
Makes the while loop O(1) - it will run five times, however large n is. The for loop is O(N), so it's O(N) overall.
In algorithm 2, by contrast, x decreases as
x = x/5
As x = n to start with, this while loop runs in O(logN). However, the inner for loop also reduces as logN each time. Therefore you are carrying out n + n/5 + n/25 + ... operations, for O(N) again.
Below a formal method to deduce the order of growth of both your algorithms (I hope you're comfortable with Sigma Notation):
Algorithm 1 :
you decrease the value of x by 5 , so you do n + n-5 + n-10 +.... which is O(N^2)
Algorithm 2 :
you decrease the value of x by n/5 , so you do n + n/5 + n/25 +.... which is O(N logN)
see wikipedia for big-oh {O()} notation
Related
Computing complexity and Big o of an algorithm
T(n) = 5n log n + 3log n + 2 // to the base 2 big o = o(n log n)
for(int x = 0,i = 1;i <= N;i*=2)
{
for(int j = 1 ;j <= i ;j++)
{
x++;
}
}
The Big o expected was linear where as mine is logarithmic
Your Big-Oh analysis is not correct. While it is true that the outer loop is executed log n times, the inner loop is linear in i at each iteration.
If you count the total number of iterations of the inner loop, you will see that the whole thing is linear:
The inner loop will do 1 + 2 + 4 + 8 + 16 + ... + (the last power of 2 <= N) iterations. This sum will be between N and 2*N, which makes the whole loop linear.
Let me explain why your analysis is wrong.
It is clear that inner loop will execute 1 + 2 + 4 + ... + 2^k times where k is the biggest integer which satisfies equation . This implies that upper bound for k is
Without loss of generality we can take upper bound for k and assume that k is integer, complexity equals to 1 + 2 + 4 + ... + = which is geometric series so it is equal to
Therefore in O notation it is O(n)
First, you should notice that your analysis is not logarithmic! As N \log N is not logarithmic.
Also, the time complexity is T(n) = sum_{j = 0}^{log(n)} 2^j (as the value of i duplicated each time). Hence, T(n) = 2^(log(N) + 1) - 1 = 2N - 1 = \Theta(N).
so I got this algorithm I need to calculate its time complexity
which goes like
for i=1 to n do
k=i
while (k<=n) do
FLIP(A[k])
k = k + i
where A is an array of booleans, and FLIP is as it is, flipping the current value. therefore it's O(1).
Now I understand that the inner while loop should be called
n/1+n/2+n/3+...+n/n
If I'm correct, but is there a formula out there for such calculation?
pretty confused here
The more exact computation is T(n) \sum((n-i)/i) for i = 1 to n (because k is started from i). Hence, the final sum is n + n/2 + ... + n/n - n = n(1 + 1/2 + ... + 1/n) - n, approximately. We knew 1 + 1/2 + ... + 1/n = H(n) and H(n) = \Theta(\log(n)). Hence, T(n) = \Theta(n\log(n)). The -n has not any effect on the asymptotic computaional cost, as n = o(n\log(n)).
Lets say we want to calculate sum of this equation
n + n / 2 + n / 3 + ... + n / n
=> n ( 1 + 1 / 2 + 1 / 3 + ..... + 1 / n )
Then in bracket ( 1 + 1 / 2 + 1 / 3 + ... + 1 / n ) this is a well known Harmonic series and i am afraid there is no proven formula to calculate Harmonic series.
The given problem boils down to calculate below sum -Sum of harmonic series
Although this sum can't be calculated accurately, however you can still find asymptotic upper bound for this sum, which is approximately O(log(n)).
Hence answer to above problem will be - O(nlog(n))
I am given a formula f(n) where f(n) is defined, for all non-negative integers, as:
f(0) = 1
f(1) = 1
f(2) = 2
f(2n) = f(n) + f(n + 1) + n (for n > 1)
f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
My goal is to find, for any given number s, the largest n where f(n) = s. If there is no such n return None. s can be up to 10^25.
I have a brute force solution using both recursion and dynamic programming, but neither is efficient enough. What concepts might help me find an efficient solution to this problem?
I want to add a little complexity analysis and estimate the size of f(n).
If you look at one recursive call of f(n), you notice, that the input n is basically divided by 2 before calling f(n) two times more, where always one call has an even and one has an odd input.
So the call tree is basically a binary tree where always the half of the nodes on a specific depth k provides a summand approx n/2k+1. The depth of the tree is log₂(n).
So the value of f(n) is in total about Θ(n/2 ⋅ log₂(n)).
Just to notice: This holds for even and odd inputs, but for even inputs the value is about an additional summand n/2 bigger. (I use Θ-notation to not have to think to much about some constants).
Now to the complexity:
Naive brute force
To calculate f(n) you have to call f(n) Θ(2log₂(n)) = Θ(n) times.
So if you want to calculate the values of f(n) until you reach s (or notice that there is no n with f(n)=s) you have to calculate f(n) s⋅log₂(s) times, which is in total Θ(s²⋅log(s)).
Dynamic programming
If you store every result of f(n), the time to calculate a f(n) reduces to Θ(1) (but it requires much more memory). So the total time complexity would reduce to Θ(s⋅log(s)).
Notice: Since we know f(n) ≤ f(n+2) for all n, you don't have to sort the values of f(n) and do a binary search.
Using binary search
Algorithm (input is s):
Set l = 1 and r = s
Set n = (l+r)/2 and round it to the next even number
calculate val = f(n).
if val == s then return n.
if val < s then set l = n
else set r = n.
goto 2
If you found a solution, fine. If not: try it again but round in step 2 to odd numbers. If this also does not return a solution, no solution exists at all.
This will take you Θ(log(s)) for the binary search and Θ(s) for the calculation of f(n) each time, so in total you get Θ(s⋅log(s)).
As you can see, this has the same complexity as the dynamic programming solution, but you don't have to save anything.
Notice: r = s does not hold for all s as an initial upper limit. However, if s is big enough, it holds. To be save, you can change the algorithm:
check first, if f(s) < s. If not, you can set l = s and r = 2s (or 2s+1 if it has to be odd).
Can you calculate the value of f(x) which x is from 0 to MAX_SIZE only once time?
what i mean is : calculate the value by DP.
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 3
f(4) = 7
f(5) = 4
... ...
f(MAX_SIZE) = ???
If the 1st step is illegal, exit. Otherwise, sort the value from small to big.
Such as 1,1,2,3,4,7,...
Now you can find whether exists n satisfied with f(n)=s in O(log(MAX_SIZE)) time.
Unfortunately, you don't mention how fast your algorithm should be. Perhaps you need to find some really clever rewrite of your formula to make it fast enough, in this case you might want to post this question on a mathematics forum.
The running time of your formula is O(n) for f(2n + 1) and O(n log n) for f(2n), according to the Master theorem, since:
T_even(n) = 2 * T(n / 2) + n / 2
T_odd(n) = 2 * T(n / 2) + 1
So the running time for the overall formula is O(n log n).
So if n is the answer to the problem, this algorithm would run in approx. O(n^2 log n), because you have to perform the formula roughly n times.
You can make this a little bit quicker by storing previous results, but of course, this is a tradeoff with memory.
Below is such a solution in Python.
D = {}
def f(n):
if n in D:
return D[n]
if n == 0 or n == 1:
return 1
if n == 2:
return 2
m = n // 2
if n % 2 == 0:
# f(2n) = f(n) + f(n + 1) + n (for n > 1)
y = f(m) + f(m + 1) + m
else:
# f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
y = f(m - 1) + f(m) + 1
D[n] = y
return y
def find(s):
n = 0
y = 0
even_sol = None
while y < s:
y = f(n)
if y == s:
even_sol = n
break
n += 2
n = 1
y = 0
odd_sol = None
while y < s:
y = f(n)
if y == s:
odd_sol = n
break
n += 2
print(s,even_sol,odd_sol)
find(9992)
This recursive in every iteration for 2n and 2n+1 is increasing values, so if in any moment you will have value bigger, than s, then you can stop your algorithm.
To make effective algorithm you have to find or nice formula, that will calculate value, or make this in small loop, that will be much, much, much more effective, than your recursion. Your recursion is generally O(2^n), where loop is O(n).
This is how loop can be looking:
int[] values = new int[1000];
values[0] = 1;
values[1] = 1;
values[2] = 2;
for (int i = 3; i < values.length /2 - 1; i++) {
values[2 * i] = values[i] + values[i + 1] + i;
values[2 * i + 1] = values[i - 1] + values[i] + 1;
}
And inside this loop add condition of possible breaking it with success of failure.
What is the time complexity for the following function?
for(int i = 0; i < a.size; i++) {
for(int j = i; j < a.size; i++) {
//
}
}
I think it is less than big O n^2 because we arent iterating over all of the elements in the second for loop. I believe the time complexity comes out to be something like this:
n[ (n) + (n-1) + (n-2) + ... + (n-n) ]
But when I solve this formula it comes out to be
n^2 - n + n^2 - 2n + n^2 - 3n + ... + n^2 - n^2
Which doesn't seem correct at all. Can somebody tell me exactly how to solve this problem, and where I am wrong.
That is O(n^2). If you consider the iteration where i = a.size() - 1, and you work your way backwards (i = a.size() - 2, i = a.size - 3, etc), you are looking at the following sum of number of iterations, where n = a.size.
1 + 2 + 3 + 4 + ... + n
The sum of this series is n(n+1)/2, which is O(n^2). Note that big-O notation ignores constants and takes the highest polynomial power when it is applied to a polynomial function.
It will run for:
1 + 2 + 3 + .. + n
Which is 1/2 n(n+1) which give us O(n^2)
The Big-O notation will only keep the dominant term, neglecting constants too
The Big-O is only used to compare algorithms on the same variation of a problem using the same complexity analysis standard, if and only if the dominant terms are different.
If the dominant terms are the same, you need to compare Big-Theta or Time complexity, which will show minor differences.
Example
A
for i = 1 .. n
for j = i .. n
..
B
for i = 1 .. n
for j = 1 .. n
..
We have
Time(A) = 1/2 n(n+1) ~ O(n^2)
Time(B) = n^2 ~ O(n^2)
O(A) = O(B)
T(A) < T(B)
Analysis
To visualize how we got 1 + 2 + 3 + .. n:
for i = 1 .. n:
print "(1 + "
sum = 0
for j = i .. n:
sum++
print sum") + "
will print the following:
(1+n) + (1+(n-1)) + .. + (1+3) + (1+2) + (1+1) + (1+0)
n+1 + n + n-1 + .. + 3 + 2 + 1
1 + 2 + 3 + .. + n + n+1
1/2 n(n+1) + (n+1)
1/2 n^2 + 1/2 n + n + 1
1/2 n^2 + 3/2 n + 1
Yes, the number of iterations is strictly less than n^2, but it's still Θ(n^2). It will eventually be greater than n^k for any k<2, and it will eventually be less than n^k for any k>2.
(As a side note, computer scientists often say big-O when they really mean big-theta (Θ). It's technically correct to say that almost every algorithm you've seen has O(n!) running time; all reasonably algorithms have running times that grow no more quickly than n!. But it's not really useful to say that the complexity is O(n!) if it's also O(n log n), so by some kind of Gricean maxim we assume that when someone says an algorithm's complexiy is O(f(x)) that f(x) is as small as possible.)
The F series is defined as
F(0) = 1
F(1) = 1
F(i) = i * F(i - 1) * F(i - 2) for i > 1
The task is to find the number of different divisors for F(i)
This question is from Timus . I tried the following Python but it surely gives a time limit exceeded. This bruteforce approach will not work for a large input since it will cause integer overflow as well.
#!/usr/bin/env python
from math import sqrt
n = int(raw_input())
def f(n):
global arr
if n == 0:
return 1
if n == 1:
return 1
a = 1
b = 1
for i in xrange(2, n + 1):
k = i * a * b
a = b
b = k
return b
x = f(n)
cnt = 0
for i in xrange(1, int(sqrt(x)) + 1):
if x % i == 0:
if x / i == i:
cnt += 1
else:
cnt += 2
print cnt
Any optimization?
EDIT
I have tried the suggestion, and rewrite the solution: (not storing the F(n) value directly, but a list of factors)
#!/usr/bin/env python
#from math import sqrt
T = 10000
primes = range(T)
primes[0] = False
primes[1] = False
primes[2] = True
primes[3] = True
for i in xrange(T):
if primes[i]:
j = i + i
while j < T:
primes[j] = False
j += i
p = []
for i in xrange(T):
if primes[i]:
p.append(i)
n = int(raw_input())
def f(n):
global p
if n == 1:
return 1
a = dict()
b = dict()
for i in xrange(2, n + 1):
c = a.copy()
for y in b.iterkeys():
if c.has_key(y):
c[y] += b[y]
else:
c[y] = b[y]
k = i
for y in p:
d = 0
if k % y == 0:
while k % y == 0:
k /= y
d += 1
if c.has_key(y):
c[y] += d
else:
c[y] = d
if k < y: break
a = b
b = c
k = 1
for i in b.iterkeys():
k = k * (b[i] + 1) % (1000000007)
return k
print f(n)
And it still gives TL5, not faster enough, but this solves the problem of overflow for value F(n).
First see this wikipedia article on the divisor function. In short, if you have a number and you know its prime factors, you can easily calculate the number of divisors (get SO to do TeX math):
$n = \prod_{i=1}^r p_i^{a_i}$
$\sigma_x(n) = \prod_{i=1}^{r} \frac{p_{i}^{(a_{i}+1)x}-1}{p_{i}^x-1}$
Anyway, it's a simple function.
Now, to solve your problem, instead of keeping F(n) as the number itself, keep it as a set of prime factors and exponent sizes. Then the function that calculates F(n) simply takes the two sets for F(n-1) and F(n-2), sums the exponents of the same prime factors in both sets (assuming zero for nonexistent ones) and additionally adds the set of prime factors and exponent sizes for the number i. This means that you need another simple1 function to find the prime factors of i.
Computing F(n) this way, you just need to apply the above formula (taken from Wikipedia) to the set and there's your value. Note also that F(n) can quickly get very large. This solution also avoids usage of big-num libraries (since no prime factor nor its exponent is likely to go beyond 4 billion2).
1 Of course this is not so simple for arbitrarily large i, otherwise we wouldn't have any form of security right now, but for your application it should be simple enough.
2 Well it might. If you happen to figure out a simple formula answering your question given any n, then large ns would also be possible in the test case, for which this algorithm is likely going to give a time limit exceeded.
That is a fun problem.
The F(n) grow extremely fast. Since F(n) <= F(n+1) for all n, we have
F(n+2) > F(n)²
for all n, and thus
F(n) > 2^(2^(n/2-1))
for n > 2. That crude estimate already shows that one cannot store these numbers for any but the smallest n. By that F(100) requires more than (2^49) bits of storage, and 128 GB are only 2^40 bits. Actually, the prime factorisation of F(100) is
*Fiborial> fiborials !! 100
[(2,464855623252387472061),(3,184754360086075580988),(5,56806012190322167100)
,(7,20444417903078359662),(11,2894612619136622614),(13,1102203323977318975)
,(17,160545601976374531),(19,61312348893415199),(23,8944533909832252),(29,498454445374078)
,(31,190392553955142),(37,10610210054141),(41,1548008760101),(43,591286730489)
,(47,86267571285),(53,4807526976),(59,267914296),(61,102334155),(67,5702887),(71,832040)
,(73,317811),(79,17711),(83,2584),(89,144),(97,3)]
and that would require about 9.6 * 10^20 (roughly 2^70) bits - a little less than half of them are trailing zeros, but even storing the numbers à la floating point numbers with a significand and an exponent doesn't bring the required storage down far enough.
So instead of storing the numbers themselves, one can consider the prime factorisation. That also allows an easier computation of the number of divisors, since
k k
divisors(n) = ∏ (e_i + 1) if n = ∏ p_i^e_i
i=1 i=1
Now, let us investigate the prime factorisations of the F(n) a little. We begin with the
Lemma: A prime p divides F(n) if and only if p <= n.
That is easily proved by induction: F(0) = F(1) = 1 is not divisible by any prime, and there are no primes <= 1.
Now suppose that n > 1 and
A(k) = The prime factors of F(k) are exactly the primes <= k
holds for k < n. Then, since
F(n) = n * F(n-1) * F(n-2)
the set prime factors of F(n) is the union of the sets of prime factors of n, F(n-1) and F(n-2).
By the induction hypothesis, the set of prime factors of F(k) is
P(k) = { p | 1 < p <= k, p prime }
for k < n. Now, if n is composite, all prime factors of n are samller than n, hence the set of prime factors of F(n) is P(n-1), but since n is not prime, P(n) = P(n-1). If, on the other hand, n is prime, the set of prime factors of F(n) is
P(n-1) ∪ {n} = P(n)
With that, let us see how much work it is to track the prime factorisation of F(n) at once, and update the list/dictionary for each n (I ignore the problem of finding the factorisation of n, that doesn't take long for the small n involved).
The entry for the prime p appears first for n = p, and is then updated for each further n, altogether it is created/updated N - p + 1 times for F(N). Thus there are
∑ (N + 1 - p) = π(N)*(N+1) - ∑ p ≈ N²/(2*log N)
p <= N p <= N
updates in total. For N = 10^6, about 3.6 * 10^10 updates, that is way more than can be done in the allowed time (0.5 seconds).
So we need a different approach. Let us look at one prime p alone, and follow the exponent of p in the F(n).
Let v_p(k) be the exponent of p in the prime factorisation of k. Then we have
v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
and we know that v_p(F(k)) = 0 for k < p. So (assuming p is not too small to understand what goes on):
v_p(F(n)) = v_p(n) + v_p(F(n-1)) + v_p(F(n-2))
v_p(F(p)) = 1 + 0 + 0 = 1
v_p(F(p+1)) = 0 + 1 + 0 = 1
v_p(F(p+2)) = 0 + 1 + 1 = 2
v_p(F(p+3)) = 0 + 2 + 1 = 3
v_p(F(p+4)) = 0 + 3 + 2 = 5
v_p(F(p+5)) = 0 + 5 + 3 = 8
So we get Fibonacci numbers for the exponents, v_p(F(p+k)) = Fib(k+1) - for a while, since later multiples of p inject further powers of p,
v_p(F(2*p-1)) = 0 + Fib(p-1) + Fib(p-2) = Fib(p)
v_p(F(2*p)) = 1 + Fib(p) + Fib(p-1) = 1 + Fib(p+1)
v_p(F(2*p+1)) = 0 + (1 + Fib(p+1)) + Fib(p) = 1 + Fib(p+2)
v_p(F(2*p+2)) = 0 + (1 + Fib(p+2)) + (1 + Fib(p+1)) = 2 + Fib(p+3)
v_p(F(2*p+3)) = 0 + (2 + Fib(p+3)) + (1 + Fib(p+2)) = 3 + Fib(p+4)
but the additional powers from 2*p also follow a nice Fibonacci pattern, and we have v_p(F(2*p+k)) = Fib(p+k+1) + Fib(k+1) for 0 <= k < p.
For further multiples of p, we get another Fibonacci summand in the exponent, so
n/p
v_p(F(n)) = ∑ Fib(n + 1 - k*p)
k=1
-- until n >= p², because multiples of p² contribute two to the exponent, and the corresponding summand would have to be multiplied by 2; for multiples of p³, by 3 etc.
One can also split the contributions of multiples of higher powers of p, so one would get one Fibonacci summand due to it being a multiple of p, one for it being a multiple of p², one for being a multiple of p³ etc, that yields
n/p n/p² n/p³
v_p(F(n)) = ∑ Fib(n + 1 - k*p) + ∑ Fib(n + 1 - k*p²) + ∑ Fib(n + 1 - k*p³) + ...
k=1 k=1 k=1
Now, in particular for the smaller primes, these sums have a lot of terms, and computing them that way would be slow. Fortunately, there is a closed formula for sums of Fibonacci numbers whose indices are an arithmetic progression, for 0 < a <= s
m
∑ Fib(a + k*s) = (Fib(a + (m+1)*s) - (-1)^s * Fib(a + m*s) - (-1)^a * Fib(s - a) - Fib(a)) / D(s)
k=0
where
D(s) = Luc(s) - 1 - (-1)^s
and Luc(k) is the k-th Lucas number, Luc(k) = Fib(k+1) + Fib(k-1).
For our purposes, we only need the Fibonacci numbers modulo 10^9 + 7, then the division must be replaced by a multiplication with the modular inverse of D(s).
Using these facts, the number of divisors of F(n) modulo 10^9+7 can be computed in the allowed time for n <= 10^6 (about 0.06 seconds on my old 32-bit box), although with Python, on the testing machines, further optimisations might be necessary.