I am trying to find the Big O for stooge sort. From Wikipedia
algorithm stoogesort(array L, i = 0, j = length(L)-1)
    if L[j] < L[i] then
        L[i] ↔ L[j]
    if j - i > 1 then
        t = (j - i + 1)/3
        stoogesort(L, i  , j-t)
        stoogesort(L, i+t, j  )
        stoogesort(L, i  , j-t)
    return L
I am bad at performance analysis ... I drew the recursion tree
I believe the following:
height: log(n)
work on level 0: n // do I start from level 0 or 1?
work on level 1: 2n
work on level 2: 4n
work on level 3: 8n
work on level log(n): (2^log(n)) * n = O(n^2)? 2^log2(n) = n, but what does 2^log3(n) actually give?
So is it O(n^2 * log(n)) = O(n^2)? That's far from Wikipedia's O(n^(log 3 / log 1.5)) ...
The size of the problem at level k is (2/3)^k * n. The size at the lowest level is 1, so setting (2/3)^k * n = 1, the depth is k = log_1.5(n) (divide both sides by (2/3)^k, take logs base 1.5).
The number of invocations at level k is 3^k. At level k = log_1.5(n), this is 3^(log_1.5 n) = ((1.5)^(log_1.5 3))^(log_1.5 n) = ((1.5)^(log_1.5 n))^(log_1.5 3) = n^(log_1.5 3) = n^(log 3 / log 1.5).
Since the work at each level increases geometrically, the work at the leaves dominates.
You can use the Master Theorem to find this answer.
We can see from the algorithm that the recurrence relation is:
T(n) = 3*T(2/3 n) + 1
Applying the theorem:
f(n) = 1 = O(n^c), where c = 0.
a = 3, b = 3/2 => log_3/2(3) ≈ 2.70
Since c < log_3/2(3), we are in Case 1 of the theorem, so:
T(n) = O(n^(log_3/2 3)) = O(n^2.70)
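A quick way to sanity-check that exponent is to count recursive calls empirically, since each call does O(1) work besides recursion. A minimal Java sketch (the harness is mine, not from the question); the ratio should stay roughly constant if the bound is tight:

class StoogeCount {
    static long calls;

    // Same recursion as the pseudocode; the call count is input-independent.
    static void stooge(int[] a, int i, int j) {
        calls++;
        if (a[j] < a[i]) { int tmp = a[i]; a[i] = a[j]; a[j] = tmp; }
        if (j - i > 1) {
            int t = (j - i + 1) / 3;
            stooge(a, i, j - t);
            stooge(a, i + t, j);
            stooge(a, i, j - t);
        }
    }

    public static void main(String[] args) {
        double e = Math.log(3) / Math.log(1.5);   // ≈ 2.7095
        for (int n = 100; n <= 800; n *= 2) {
            calls = 0;
            stooge(new int[n], 0, n - 1);
            System.out.printf("n=%d calls=%d calls/n^e=%.3f%n",
                    n, calls, calls / Math.pow(n, e));
        }
    }
}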
Probably this could help:
/**********************************************************************
* Complexity:
* This algorithm makes exactly one comparison per invocation.
* At each step the algorithm divides the initial array of size n
* into 3 (overlapping) subarrays of size 2*n/3 each:
* N
* / | \
* 2N/3 2N/3 2N/3
* /
* (2N/3)*(2/3)
* /
* ...
* /
* N * (2/3)^k = 1
* By considering this tree we can find the depth - k:
* N * (2/3)^k = 1 =>>
* N = 1 / (2/3)^k =>>
* N = (3/2)^k =>>
* log(3/2, N) = log(3/2, (3/2)^k) =>>
* k = log(3/2, N) (!!!)
*
* At depth k the algorithm makes 3^k comparisons =>> at the last level we get:
* 3^(log(3/2, N)) =>> N^(log(3/2, 3))
* comparisons.
*
* We can compute the full work:
* 1 + 3 + 9 + ... + 3^(log(3/2, N))
* by using geometric progression formulas.
*
*************************************************************************/
public static void sort(Comparable[] a, int lo, int hi) {
    if (lo >= hi) return;
    if (less(a[hi], a[lo])) exch(a, hi, lo);   // put the endpoints in order
    if (hi - lo + 1 > 2) {
        int t = (hi - lo + 1) / 3;
        sort(a, lo, hi - t);                   // first two thirds
        sort(a, lo + t, hi);                   // last two thirds
        sort(a, lo, hi - t);                   // first two thirds again
    }
}
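The code above assumes the usual Sedgewick-style less and exch helpers, which are not shown. A minimal sketch of what they presumably look like:

private static boolean less(Comparable v, Comparable w) {
    return v.compareTo(w) < 0;    // is v strictly smaller than w?
}

private static void exch(Comparable[] a, int i, int j) {
    Comparable swap = a[i];       // swap a[i] and a[j]
    a[i] = a[j];
    a[j] = swap;
}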
If we define T(n) as the running time (with n = j - i + 1), we have:
T(n) = 3*T(n/1.5) + O(1)
You can solve that using the Master Theorem, and the answer will be Θ(n^(log_1.5 3)) = Θ(n^(log 3 / log 1.5)).
You can prove it using induction on n too.
Using a recursion tree is acceptable too:
k = number of levels = log_1.5(n)
ans = 1 + 3 + 3^2 + ... + 3^k = Θ(3^k) = Θ(3^(log_1.5 n)) = Θ(n^(log_1.5 3))
t(n) = 3⋅t(2n/3) + Θ(1)
h = height of the recursion tree = 1 + log_3/2(n)
Try to calculate it; it is easy.
At every level we have complexity 3^i⋅c, where c is some constant and i is the depth of that particular level:
t(n) = Σ_(i=0..h) c⋅3^i
This is a simple geometric progression, so
t(n) = c⋅(3^(h+1) - 1)/2 = Θ(3^h) = Θ(n^(log_3/2 3))
Related question:
In less than O(n), find a number K in the sequence 1,2,3,...,N such that the sum of 1,2,3,...,K is exactly half of the sum of 1,2,3,...,N.
Maths:
I know that the sum of the sequence 1,2,3....N is N(N+1)/2.
Therefore our task is to find K such that:
K(K+1) = 1/2 * (N)(N+1)/2 if such a K exists.
Pseudo-Code:
sum1 = n(n+1)/2
sum2 = 0
for(i=1; i<n; i++)
{
    sum2 += i;
    if(2*sum2 == sum1)
    {
        index = i;
        break;
    }
}
Problem: The solution is O(n), but I need better, such as O(1) or O(log(n))...
You're close with your equation, but you dropped the divide by 2 from the K side. You actually want
K * (K + 1) / 2 = N * (N + 1) / (2 * 2)
Or
2 * K * (K + 1) = N * (N + 1)
Plugging that into Wolfram Alpha gives the real solutions:
K = 1/2 * (-sqrt(2N^2 + 2N + 1) - 1)
K = 1/2 * (sqrt(2N^2 + 2N + 1) - 1)
Since you probably don't want the negative value, the second equation is what you're looking for. That should be an O(1) solution.
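In code, the closed form needs an exactness check, since K must be an integer and a floating-point square root can be off by one. A minimal Java sketch (names are mine):

class HalfSum {
    // Returns K with 2*K*(K+1) == N*(N+1), or -1 if no integer K exists.
    static long solveK(long n) {
        long k = (long) ((Math.sqrt(2.0 * n * n + 2.0 * n + 1.0) - 1.0) / 2.0);
        for (long c = Math.max(0, k - 1); c <= k + 1; c++) {  // sqrt may be off by one
            if (2 * c * (c + 1) == n * (n + 1)) return c;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(solveK(3));    // 2
        System.out.println(solveK(20));   // 14
        System.out.println(solveK(21));   // -1, no integer solution
    }
}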
The other answers show the analytical solutions of the equation
k * (k + 1) = n * (n + 1) / 2, where n is given.
The OP needs k to be a whole number, though, and such value may not exist for every chosen n.
We can adapt Newton's method to solve this equation using only integer arithmetic.
sum_n = n * (n + 1) / 2
k = n
repeat indefinitely                    // It usually needs only a few iterations; it's O(log(n))
    f_k = k * (k + 1)
    if f_k == sum_n
        k is the solution, exit
    if f_k < sum_n
        there's no k, exit
    k_n = (f_k - sum_n) / (2 * k + 1)  // Newton step: f(k)/f'(k)
    if k_n == 0
        k_n = 1                        // Avoid infinite loop
    k = k - k_n
Here is a C++ implementation.
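That implementation isn't reproduced here, but a direct Java transcription of the pseudocode above (my naming) would look like:

class NewtonK {
    // Newton iteration on f(k) = k*(k+1) - n*(n+1)/2, integers only.
    static long newtonK(long n) {
        long sumN = n * (n + 1) / 2;
        long k = n;                                   // start above the root
        while (true) {
            long fk = k * (k + 1);
            if (fk == sumN) return k;                 // exact hit: k is the answer
            if (fk < sumN) return -1;                 // overshot: no integer k exists
            long step = (fk - sumN) / (2 * k + 1);    // f(k) / f'(k), rounded down
            if (step == 0) step = 1;                  // force progress near the root
            k -= step;
        }
    }
}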
We can find all the pairs (n, k) that satisfy the equation for 0 < k < n ≤ N adapting the algorithm posted in the question.
n = 1                      // This algorithm compares the running sums
sum_n = 1                  //   sum_n = n * (n + 1) / 2 and sum_2k = k * (k + 1);
k = 1                      //   it finds all the pairs (n, k) where 0 < k < n ≤ N
sum_2k = 2                 //   in O(N).
while n <= N               // Note that n / k → sqrt(2) when n → ∞
    while sum_n < sum_2k
        n = n + 1          // This inner loop requires a couple of iterations,
        sum_n = sum_n + n  //   at most.
    if sum_n == sum_2k
        print n and k
    k = k + 1
    sum_2k = sum_2k + 2 * k
Here is a C++ implementation that can find the first pairs with N < 200,000,000:
        N          K         K * (K + 1)
----------------------------------------------
        3          2                   6
       20         14                 210
      119         84                7140
      696        492              242556
     4059       2870             8239770
    23660      16730           279909630
   137903      97512          9508687656
   803760     568344        323015470680
  4684659    3312554      10973017315470
 27304196   19306982     372759573255306
159140519  112529340   12662852473364940
Of course it becomes impractical for too large values and eventually overflows.
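For reference, the same scanning loop as a Java sketch (my transcription; long only delays the overflow, it doesn't prevent it):

class PairScan {
    // Prints all pairs (n, k), 0 < k < n <= maxN, with 2*k*(k+1) == n*(n+1).
    static void findPairs(long maxN) {
        long n = 1, sumN = 1;      // sumN  == n*(n+1)/2
        long k = 1, sum2k = 2;     // sum2k == k*(k+1)
        while (n <= maxN) {
            while (sumN < sum2k) { // advance n until the sums meet or cross
                n++;
                sumN += n;
            }
            if (sumN == sum2k) System.out.println(n + " " + k);
            k++;
            sum2k += 2 * k;
        }
    }
}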
Besides, there's a far better way to find all those pairs (have you noticed the patterns in the sequences of the last digits?).
We can start by manipulating this Diophantine equation:
2k(k + 1) = n(n + 1)
introducing u = n + 1 → n = u - 1
            v = k + 1    k = v - 1
2(v - 1)v = (u - 1)u
2(v^2 - v) = u^2 - u
2(4v^2 - 4v) = 4u^2 - 4u
2(4v^2 - 4v) + 2 = 4u^2 - 4u + 2
2(4v^2 - 4v + 1) = (4u^2 - 4u + 1) + 1
2(2v - 1)^2 = (2u - 1)^2 + 1
substituting x = 2u - 1 → u = (x + 1)/2
             y = 2v - 1    v = (y + 1)/2
2y^2 = x^2 + 1
x^2 - 2y^2 = -1
Which is the negative Pell's equation for 2.
It's easy to find its fundamental solution by inspection: x_1 = 1 and y_1 = 1. That would correspond to n = k = 0, a solution of the original Diophantine equation, but not of the original problem (I'm ignoring the sums of 0 terms).
Once those are known, we can calculate all the other ones with two simple recurrence relations
x_(i+1) = x_i + 2y_i
y_(i+1) = y_i + x_i
Note that we need to "skip" the even ys, as they would lead to non-integer solutions. So we can directly use these:
x_(i+2) = 3x_i + 4y_i   →   u_(i+1) = 3u_i + 4v_i - 3   →   n_(i+1) = 3n_i + 4k_i + 3
y_(i+2) = 2x_i + 3y_i       v_(i+1) = 2u_i + 3v_i - 2       k_(i+1) = 2n_i + 3k_i + 2
Summing up:
n k
-----------------------------------------------
3* 0 + 4* 0 + 3 = 3 2* 0 + 3* 0 + 2 = 2
3* 3 + 4* 2 + 3 = 20 2* 3 + 3* 2 + 2 = 14
3*20 + 4*14 + 3 = 119 2*20 + 3*14 + 2 = 84
...
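Put into code, those recurrences reproduce the table above directly; a small Java sketch:

class PellPairs {
    // n' = 3n + 4k + 3, k' = 2n + 3k + 2, starting from (n, k) = (0, 0).
    public static void main(String[] args) {
        long n = 0, k = 0;
        for (int i = 0; i < 10; i++) {
            long n2 = 3 * n + 4 * k + 3;
            long k2 = 2 * n + 3 * k + 2;
            n = n2; k = k2;
            System.out.println(n + " " + k);   // 3 2, 20 14, 119 84, ...
        }
    }
}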
It seems that the problem is asking to solve the Diophantine equation
2K(K+1) = N(N+1).
By inspection, K=2, N=3 is a solution!
Note that technically this is an O(1) problem, because N has a finite value and does not vary (and if no solution exists, the dependency on N is even meaningless).
The condition you have is that the sum of 1..N is twice the sum of 1..K.
So you have N(N+1) = 2K(K+1), or K^2 + K - (N^2 + N)/2 = 0,
which means K = (-1 +/- sqrt(1 + 2(N^2 + N)))/2,
which is O(1).
I'm trying to solve the recurrence relation T(n) = 3T(n-1) + n and I think the answer is O(n^3) because each new node spawns three child nodes in the recurrence tree. Is this correct? And, in terms of the recurrence tree, is there a more mathematical way to approach it?
Recurrence equations are just math: Unfold, sum up and simplify.
T(n) = 3T(n-1) + n = 3 (3 T(n-2) + (n-1)) + n
     = 3^2 T(n-2) + 3(n-1) + n
     = 3^3 T(n-3) + 3^2 (n-2) + 3(n-1) + n
     = ...
     = 3^i T(n-i) + Σ_(j=0..i-1) 3^j * (n-j)
     = Σ_(j=0..n-1) 3^j * (n-j)   // assuming T(0) = 0
Now we can find different upper bounds, depending on how much we want to think about it:
T(n) = Σ_(j=0..n-1) 3^j * (n-j)
     < n * Σ_(j=0..n-1) 3^j
     = n * (3^n - 1)/2
So T(n) = O(n*3^n).
You can also get a tighter bound by splitting the sum up into two parts:
T(n) = Σ_(j=0..n-1) 3^j * (n-j)
     < n * Σ_(j=0..n-x-1) 3^j + x * Σ_(j=n-x..n-1) 3^j
     = n * (3^(n-x) - 1)/2 + x * (3^n - 3^(n-x))/2
Using x = log_3(n) you get T(n) = O(3^n * log(n)).
You can also approximate the sum Σ (n-j)*3^j by an integral (or evaluate it exactly) and get the tight bound Θ(3^n).
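If you'd rather see the Θ(3^n) bound numerically: substituting m = n-j shows T(n)/3^n tends to Σ_(m≥1) m/3^m = 3/4. A quick sketch:

class RatioCheck {
    // T(n) = sum_{j=0}^{n-1} 3^j * (n - j); T(n)/3^n settles at 0.75.
    public static void main(String[] args) {
        for (int n = 5; n <= 30; n += 5) {
            double t = 0;
            for (int j = 0; j < n; j++) t += Math.pow(3, j) * (n - j);
            System.out.printf("n=%2d  T(n)/3^n = %.6f%n", n, t / Math.pow(3, n));
        }
    }
}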
I need some help with a question from a Data Structures course:
I was given this recursive function of mergesort (pseudo code):
Mergesort_1/3(A, p, r)
    if p < r
        then q = p + (r-p)/3   // round the result down
             Mergesort_1/3(A, p, q)
             Mergesort_1/3(A, q+1, r)
             Merge(A, p, q, r)
and these are the questions:
Let T(n) be the worst-case running time of Mergesort_1/3. Write the recurrence for T(n). Give a short explanation.
Prove that T(n) = Ω(n log n).
The gist of classic mergesort is the following recursion:
Split an unordered list in half. It suffices to compute the limit offsets of the partial lists.
Apply mergesort to each of the partial lists. After this step, the partial lists will be sorted.
Merge the sorted partial lists by inspecting the elements of both lists in their respective order, copying from and progressing in the list with the smaller element.
Let TC(n) be the time complexity of classic mergesort. The aforementioned steps take O(1) (*), 2*O(TC(ceil(n/2))), and O(n), respectively. This leads to the recursion TC(n) = cc_0 + cc_1 * n + 2 * TC(ceil(n/2)).
Consider the generalized mergesort where lists are split unevenly, though always with the same ratio. The complexity of splitting and merging remains the same, leading to the recursion TG(n) = ca_0 + ca_1 * n + TG(1/a * n + 1) + TG((a-1)/a * n + 1) for the generalized mergesort (using TG(x+1) instead of TG(ceil(x)); ca_0, ca_1 being the constants hidden in the O(_) notation; a = 3 for Mergesort_1/3).
This recurrence can be solved using the Akra–Bazzi method.
To this end, the recurrence needs to be written as
TG(x) = g(x) + \sum_i=1..k ( a_i * TG(b_i * x + h_i(x) ) ; x >= x_0.
with
a_i, b_i const.
a_i > 0
0 < b_i < 1
|g(x)| = O(x^c); c const.
|h_i(x)| = O(x / (log(x))^2)
which can be done by setting ...
k = 2
a_1 = 1
a_2 = 1
b_1 = 1/a
b_2 = (a-1)/a
g(x) = ca_0 + ca_1 * x
h_1(x) = 1
h_2(x) = 1
x_0 = 2
-> TG(x) = ca_0 + ca_1 * x + 1 * TG(1/a * x + 1) + 1 * TG((a-1)/a * x + 1) ; x >= x_0.
The Akra-Bazzi theorem requires the exponent p be found in \sum_i=1..k ( a_i * (b_i)^p ) = 1. Then the following holds:
TG(x) = Θ( x^p * (1 + \integral_(1, x) ( g(u) / u^(p+1) du )) )
Specifically,
a_1 * b_1^p + a_2 * b_2^p = 1
=> (1/a)^p + ((a-1)/a)^p = 1
<=> p = 1
... and thus ...
TG(x) = Θ( x * (1 + \integral_(1, x) ( (ca_0 + ca_1 * u) / u^2 du )) )
      = Θ( x * (1 + [ -ca_0/u + ca_1 * log(u) ]_(1,x)) )
      = Θ( x + ca_0 * x - ca_0 + ca_1 * x * log(x) )
      = Θ( x * log(x) )
(*) Strictly speaking this is incorrect, since basic arithmetic on binary representations and memory accesses cost O(log n). However, this makes no difference asymptotically for Θ(n log n) complexity.
Instead of halving, this mergesort partitions the data into two chunks of sizes n/3 and 2n/3. Further, when merging, you still have to go through all n elements of the two sorted chunks. Hence, the recurrence can be written as:
T(n) = T(n/3) + T(2n/3) + O(n)
The standard Master Theorem does not cover this uneven split, but a recursion tree does: every level of the tree does O(n) total merge work, and the depth is Θ(log(n)) (between log_3(n) and log_3/2(n)). So T(n) = Θ(n * log(n)).
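To make the pseudocode concrete, here is a runnable Java sketch (the merge is the standard textbook merge; all names are mine):

class MergesortThird {
    // Sorts a[p..r], splitting at one third instead of the middle.
    static void sort(int[] a, int p, int r) {
        if (p < r) {
            int q = p + (r - p) / 3;     // rounds down, as the pseudocode requires
            sort(a, p, q);
            sort(a, q + 1, r);
            merge(a, p, q, r);
        }
    }

    // Standard merge of the sorted runs a[p..q] and a[q+1..r].
    static void merge(int[] a, int p, int q, int r) {
        int[] tmp = new int[r - p + 1];
        int i = p, j = q + 1, t = 0;
        while (i <= q && j <= r) tmp[t++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i <= q) tmp[t++] = a[i++];
        while (j <= r) tmp[t++] = a[j++];
        System.arraycopy(tmp, 0, a, p, tmp.length);
    }
}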
1. Given that T(0) = 1 and T(n) = T(⌊2n/3⌋) + c (in this case 2n/3 rounds down), what is the big-Θ bound for T(n)? Is it simply Θ(log_3/2(n))? Please tell me how to get the result.
2. Given the code
void mystery(int n) {
    if(n < 2)
        return;
    else {
        int i = 0;
        for(i = 1; i <= 8; i += 2) {
            mystery(n/3);
        }
        int count = 0;
        for(i = 1; i < n*n; i++) {
            count = count + 1;
        }
    }
}
According to the master theorem, the big-O bound is n^2, but my result is n^2 * log(n) (log base 3). I'm not sure of my result, and I don't really know how to deal with the runtime of recursive functions. Is it just simply the log function?
Or is it, as in this code, T(n) = 4*T(n/3) + n^2?
Cheers.
For (1), the recurrence solves to c log_3/2(n) + c. To see this, you can use the iteration method to expand out a few terms of the recurrence and spot a pattern:
T(n) = T(2n/3) + c
     = T(4n/9) + 2c
     = T(8n/27) + 3c
     = T((2/3)^k n) + kc
Assuming that T(1) = c and solving for the choice of k that makes the expression inside the parentheses equal to 1, we get that
1 = (2/3)^k n
(3/2)^k = n
k = log_3/2(n)
Plugging in this choice of k into the above expression gives the final result.
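You can sanity-check that depth numerically; a small Java sketch (mine, not from the answer):

class DepthCheck {
    // Shrink n by a factor of 2/3 until it reaches 1; the count of steps
    // matches log(n)/log(1.5) up to rounding.
    public static void main(String[] args) {
        for (int n = 10; n <= 100000; n *= 10) {
            int k = 0;
            double m = n;
            while (m > 1) { m = 2 * m / 3; k++; }
            System.out.printf("n=%6d  steps=%2d  log_1.5(n)=%.2f%n",
                    n, k, Math.log(n) / Math.log(1.5));
        }
    }
}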
For (2), you have the recurrence relation
T(n) = 4T(n/3) + n^2
Using the master theorem with a = 4, b = 3, and d = 2, we see that log_b(a) = log_3(4) ≈ 1.26 < d, so this solves to O(n^2). Here's one way to see this. At the top level, you do n^2 work. At the layer below that, you have four calls each doing n^2/9 work, so you do 4n^2/9 work, less than the top layer. The layer below that does 16 calls that each do n^2/81 work for a total of 16n^2/81 work, again much less work than the layer above. Overall, each layer does exponentially less work than the layer above it, so the top layer ends up dominating all the other ones asymptotically.
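Summing the levels makes this concrete: level i has 4^i calls, each doing (n/3^i)^2 work, so

total work = Σ_(i=0..depth) 4^i * (n/3^i)^2
           = n^2 * Σ_(i=0..depth) (4/9)^i
           < n^2 * 1/(1 - 4/9)
           = (9/5) * n^2 = O(n^2)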
Let's do some complexity analysis, and we'll find that the asymptotic behavior of T(n) depends on the constants of the recursion.
Given T(n) = A T(n*p) + C, with A,C>0 and p<1, let's first try to prove T(n)=O(n log n). We try to find D such that for large enough n
T(n) <= D(n * log(n))
This yields
A * D(n*p * log(n*p)) + C <= D*(n * log(n))
Looking at the higher order terms, this results in
A*D*p <= D
So, if A*p <= 1, this works, and T(n)=O(n log n).
In the special case that A<=1 we can do better, and prove that T(n)=O(log n):
T(n) <= D log(n)
Yields
A * D(log(n*p)) + C <= D*(log(n))
Looking at the higher order terms, this results in
A * D * log(n) + C + A * D *log(p) <= D * log(n)
Which is true for large enough D and n since A<=1 and log(p)<0.
Otherwise, if A*p>1, let's find the minimal value of q such that T(n)=O(n^q). We try to find the minimal q such that there exists D for which
T(n) <= D n^q
This yields
A * D p^q n^q + C <= D*n^q
Looking at the higher order terms, this results in
A*D*p^q <= D
The minimal q that satisfies this is defined by
A*p^q = 1
So we conclude that T(n)=O(n^q) for q = - log(A) / log(p).
Now, given T(n) = A T(n*p) + B n^a + C, with A,B,C>0 and p<1, try to prove that T(n)=O(n^q) for some q. We try to find the minimal q>=a such that for some D>0,
T(n) <= D n^q
This yields
A * D n^q p^q + B n^a + C <= D n^q
Trying q==a, this will work only if
ADp^a + B <= D
I.e. T(n)=O(n^a) if Ap^a < 1.
Otherwise we get to Ap^q = 1 as before, which means T(n)=O(n^q) for q = - log(A) / log(p).
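As a cross-check against the stooge sort answers above: there A = 3, p = 2/3 and C is the constant work per call, so A*p = 2 > 1 and

q = -log(A) / log(p) = -log(3) / log(2/3) = log(3) / log(1.5) ≈ 2.7095

which is exactly the n^(log 3 / log 1.5) exponent derived earlier.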
I'm trying to calculate the following:
f(n) = ∑ (i*log(i)), for i = 1 to log(n).
How do I do that?
I have succeeded in doing:
f(n) = ∑ (i*log(i)), for i = 1 to n,
which is: 1*log(1) + 2*log(2) + ... + n*log(n) <= n*(n*log(n)),
where in the end: f(n) = ∑ (i*log(i)) = Ω(n^2 log^2(n)) (where i = 1 to n).
But I don't know how to do the first one. Any ideas, anybody?
Regards
First, you have to remove the ^2 from log^2(n); your corrected result would be
f(n) = ∑ (i*log(i)) <= n*(n*log(n)) = O(n^2*log(n))
Then, for the case where i goes from 1 to log(n), just substitute n by log(n).
Let's define
g(n) = ∑ (i*log(i)), for i = 1 to log(n)   // The result you are looking for
f(n) = ∑ (i*log(i)), for i = 1 to n        // The result we have
Then
g(n) = f(log(n)) = O(log^2(n) * log(log(n)))
f(n) = Θ(log^2(n) * log(log(n)))
Proof:
f(n) = 1 * log(1) + 2 * log(2) + ... + log(n) * log(log(n)) <=
    <= log(n) * log(log(n)) * log(n) =
    = O(log^2(n) * log(log(n)))
f(n) = 1 * log(1) + 2 * log(2) + ... + log(n) * log(log(n)) >=
    >= (log(n)/2) * log(log(n)/2) + (log(n)/2 + 1) * log(log(n)/2 + 1) + ... + log(n) * log(log(n)) >=
    >= (log(n)/2) * log(log(n)/2) + ... + (log(n)/2) * log(log(n)/2) =   // log(n)/2 terms
    = (log(n)/2) * log(log(n)/2) * (log(n)/2) =
    = (log^2(n)/4) * (log(log(n)) - log(2)) =
    = Ω(log^2(n) * log(log(n)))
If you know some calculus, you can often find the order of growth of such sums by integration.
If f is a positive monotonic function, ∑ f(i) for 1 <= i <= k can be approximated by the integral ∫ f(t) dt (t ranging from 1 to k). So if you know a primitive function F of f (in modern parlance an antiderivative), you can easily evaluate the integral to F(k) - F(1). For growth analysis, the constant term F(1) is irrelevant, so you can approximate the sum (as well as the integral) simply by F(k).
A tool that is often useful in such calculations is partial integration,
∫_a^b f'(t)*g(t) dt = f(b)*g(b) - f(a)*g(a) - ∫_a^b f(t)*g'(t) dt
which follows from the product rule (f*g)' = f' * g + f * g'. It is often helpful to write f as 1*f in order to apply partial integration, for example to find a primitive of the (natural) logarithm,
∫ log t dt = ∫ 1*log t dt = t*log t - ∫ t * (log t)' dt = t*log t - ∫ t*(1/t) dt = t*log t - t
In this case, with f(t) = t*log t, partial integration yields
∫ t*log t dt = 1/2*t^2 * log t - ∫ (1/2*t^2) * (log t)' dt
= 1/2*t^2 * log t - 1/2 ∫ t^2*(1/t) dt
= 1/2*t^2 * log t - 1/4*t^2
Since the second term grows slower than the first, it can be ignored for growth analysis, so you obtain
∑_(i=1..k) i*log(i) ≈ 1/2 * k^2 * log(k)
Since logarithms to different bases only differ by a constant factor, a different choice of logarithm just changes the constant factor, and you see that in all cases
∑_(i=1..k) i*log(i) ∈ Θ(k^2 * log(k))
For your specific problem, k = log n, so the sum is Θ((log n)^2 * log(log n)), as has been derived in a different way by the other answers.
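A quick numeric check of that approximation (a sketch; the ratio creeps toward 1 as the lower-order k^2/4 term fades):

class SumCheck {
    // Compares sum_{i=1}^{k} i*ln(i) with (1/2)*k^2*ln(k).
    public static void main(String[] args) {
        for (int k = 100; k <= 1000000; k *= 10) {
            double s = 0;
            for (int i = 2; i <= k; i++) s += i * Math.log(i);
            System.out.printf("k=%7d  ratio=%.4f%n",
                    k, s / (0.5 * k * k * Math.log(k)));
        }
    }
}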
http://img196.imageshack.us/img196/7012/5f1ff74e3e6e4a72bbd5483.png
Now substitute log(n) for n and you'll get that it's very tightly bounded by log^2(n)*log(log(n)).