My Question
What is the Big-O complexity for this code snippet?
Assume n is a power of 4.
for (int i = 1; i <= n; i = i*4)
    for (int k = 1; k <= i; k++)
        // constant statement
What I know so far
I tried making this code into a summation to find the complexity.
This is what I got:
I got log_4(n) for the number of outer iterations by computing the series 4, 4^2, 4^3, ..., 4^r = n, which gives r = log_4(n).
I'm now stuck at this summation:
sum(i = 1 to log_4(n)) sum(k = 1 to i) 1
Please let me know if I'm doing something wrong or if there is another way to do this.
You’re on the right track here, but your innermost summation is wrong. You are right that the outer loop will iterate log_4 n times, but you’ve set up the outer sum so that i counts up as 1, 2, 3, ..., log_4 n, rather than 4^0, 4^1, 4^2, ... 4^log_4 n. As a result, that inner summation’s upper bound is incorrect. The bound should be 4^i, not i.
If you set things up this way, you’ll find that the overall sum is
4^0 + 4^1 + 4^2 + ... + 4^log_4 n
= (4^(log_4 n + 1) - 1) / (4 - 1) (using the formula for the sum of a geometric series)
= (4(4^log_4 n) - 1) / 3
= (4n - 1) / 3
= Θ(n).
You can also plug the sum into Wolfram Alpha to check the result:
https://www.wolframalpha.com/input/?i=sum+i,+i%3D1+to+log_4(n)
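If you want to sanity-check the algebra, here is a quick counting harness (a C++ sketch of my own; count stands in for the constant statement). For n a power of 4, the count matches (4n - 1)/3 exactly:

#include <cstdio>

int main() {
    for (long long n = 4; n <= 1 << 20; n *= 4) {   // powers of 4
        long long count = 0;
        for (long long i = 1; i <= n; i = i * 4)
            for (long long k = 1; k <= i; k++)
                count++;                            // the constant statement
        std::printf("n = %8lld  count = %8lld  (4n - 1)/3 = %8lld\n",
                    n, count, (4 * n - 1) / 3);
    }
}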
Related
I've been trying to set up a summation, iterating through i and then j, but I get stuck when trying to make sense of the while loop.
Can someone please give me some pointers on how to solve something like this?
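The code for this question is not shown above. Judging from the answer below, it presumably looks something like this sketch (my reconstruction, with the bounds inferred from the analysis, not the OP's exact snippet):

for (int i = 1; i <= n; i++) {
    int j = n - i;          // the inner index starts at n - i
    while (j >= 0) {
        // constant-time work
        j = j - 3;          // j decreases in steps of 3
    }
}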
Disclaimer: this answer is long and overly verbose because I wanted to provide the OP with a "baby-steps" method rather than a result. I hope they can find some help in it, should they need it.
If you get stuck when trying to derive the complexity in one go, you can try to break down the problem into smaller pieces that are easier to reason about.
Introducing notations can help in this context to structure your thought process.
Let's introduce a notation for the inner while loop. We can see the starting index j depends on n - i, so the number of operations performed by this loop will be a function of n and i. Let's represent this number of operations by G(n, i).
The outer loop depends only on n. Let's represent the number of operations by T(n).
Now, let's write down the dependency between T(n) and G(n, i), reasoning about the outer loop only - we do not care about the inner while loop for this step because we have already abstracted its complexity in the function G. So we go with:
T(n) = G(n, n - 1) + G(n, n - 2) + G(n, n - 3) + ... + G(n, 0)
= sum(k = 0, k < n, G(n, k))
Now, let's focus on the function G.
Let's introduce an additional notation and write j(t) for the value of the index j at the t-th iteration performed by the while loop.
Let's call k the value of t for which the invariant of the while loop is breached, i.e. the last time the condition is evaluated.
Let's consider an arbitrary i. You could try with a couple of specific values of i (e.g. 1, 2, n) if this helps.
We can write:
j(0) = n - i
j(1) = n - i - 3
j(2) = n - i - 6
...
j(k - 1) = n - i - 3(k - 1) such that j(k-1) >= 0
j(k) = n - i - 3k such that j(k) < 0
Finding k involves solving the inequality n - i - 3k < 0. To make it easier, let's "ignore" the fact that k is an integer and that we need to take the integer part of the result below.
n - i - 3k < 0 <=> k > (n - i) / 3, so k ≈ (n - i) / 3
So there are (n - i) / 3 "steps" to consider. By steps we refer here to the number of evaluation of the loop condition. The number of times the operation j <- j - 3 is performed would be the latter minus one.
So we found an expression for G(n, i):
G(n, i) = (n - i) / 3
Now let's get back to the expression of T(n) we found above:
T(n) = sum(k = 0, k < n, (n - k) / 3)
Since when k varies from 0 to n - 1, n - k varies from n down to 1, we can equivalently write T(n) as:
T(n) = sum(k = 1, k <= n, k / 3)
= (1/3).sum(k = 1, k <= n, k)
= (1/6).n.(n+1)
And you can therefore conclude with
T(n) = Theta(n^2)
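If you want to double-check this numerically, here is a small counting harness in C++ (assuming the reconstructed loop above; the measured count differs from n(n+1)/6 only in lower-order terms, because we ignored the integer parts):

#include <cstdio>

int main() {
    for (int n = 100; n <= 1000; n += 300) {
        long long steps = 0;
        for (int i = 1; i <= n; i++) {
            int j = n - i;
            while (j >= 0) {
                steps++;            // one execution of the while-loop body
                j -= 3;
            }
        }
        std::printf("n = %4d  steps = %8lld  n(n+1)/6 = %8lld\n",
                    n, steps, 1LL * n * (n + 1) / 6);
    }
}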
This resolution exhibited some patterns from which you can create your own recipe to solve similar exercises:
Consider the number of operations at individual levels (one loop at a time) and introduce notations for the functions representing them;
Find relationships between these functions;
Find an algebraic expression for the innermost functions, which don't depend on the other functions you introduced and for which the number of steps can be calculated from a loop invariant;
Using the relationships between these functions, find expressions for the functions higher in the stack.
In order to calculate the overall time complexity of code like this, replace each loop with a summation. Moreover, note that the second loop runs (n - i)/3 times, since j decreases with a step size of 3. So we have:
T(n) = sum(i = 1 to n) (n - i)/3 = n(n - 1)/6 = Theta(n^2)
I'm trying to analyze the worst case order of growth as a function of N for this algorithm:
for (int i = N*N; i > 1; i = i/2)
    for (int j = 0; j < i; j++) {
        total++;
    }
What I'm trying is to analyze how many times the line total++ will run by looking at the inner and outer loops. The inner loop should run (N^2)/2 times. The outer loop I don't know. Could anyone point me in the right direction?
The statement total++; runs the following number of times:
= N^2 + N^2/2 + N^2/4 + ... + N^2/2^k
= N^2 * (1 + 1/2 + 1/4 + ... + 1/2^k)
The number of terms in the above expression is log(N^2) = 2 log(N) (logs base 2).
Hence, by the geometric series formula, the sum of the series = N^2 * (1 - 1/2^(2 log N)) / (1 - 1/2)
= N^2 * (1 - 1/N^2) / (1/2)
= 2(N^2 - 1).
Hence, according to me, the order of complexity = O(N^2).
The outer loop runs with O(log N) complexity, as the value is halved on every iteration, just like in a binary search.
The outer loop runs about 2 log2(N) + 1 times (after truncating to an integer): the value of i decreases as N^2, N^2/2, N^2/4, ..., 1.
So the total number of times total++ runs is
sum(x = 0 to 2 log2(N)) N^2 / 2^x
For this question, since the inner loop depends on the value of a variable that is changed by the outer loop, you can't solve it simply by multiplying the iteration counts of the inner and outer loops. You will have to write out the values, figure out the series, and then solve the series to get the answer.
As in your question, total++ will run
N^2 + N^2/2 + N^2/2^2 + N^2/2^3 + ...
Then, taking N^2 common, we get
N^2 [ 1 + 1/2 + 1/2^2 + 1/2^3 + ... ]
Solve this series (a geometric series whose sum approaches 2) to get the answer.
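A quick empirical check (a C++ sketch of my own) confirms that the total levels off at about 2N^2:

#include <cstdio>

int main() {
    for (long long N = 64; N <= 4096; N *= 4) {
        long long total = 0;
        for (long long i = N * N; i > 1; i = i / 2)
            for (long long j = 0; j < i; j++)
                total++;
        std::printf("N = %5lld  total = %10lld  2N^2 = %10lld\n",
                    N, total, 2 * N * N);
    }
}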
I'm having trouble understanding how to make this into a formula.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j += i) {
        // constant-time work
    }
}
I realize what happens: for every increment of i, the inner loop takes bigger steps, so j runs through fewer values.
i = 1, you get j = 1, 2, 3, ..., 100 (for N = 100)
i = 2, you get j = 1, 3, 5, ..., 99
I'm not sure how to think this in terms of Big-theta.
The number of j iterations is about N, N/2, N/3, N/4, ..., N/N (my conclusion)
What would be the best way to think of this as a function of N?
So your question can actually be reduced to "What is the tight bound for the harmonic series 1/1 + 1/2 + 1/3 + ... + 1/N?", for which the answer is log N (you can treat it as a continuous sum instead of a discrete one, and notice that the integral of 1/N is log N).
Your harmonic series is the formula of the whole algorithm (as you have correctly concluded)
So, your sum:
N + N/2 + N/3 + ... + N/N = N * (1 + 1/2 + 1/3 + ... + 1/N) = Theta(N * log N)
So the tight bound for the algorithm is N*log N
See the [rigorous] mathematical proof here (see the "Integral Test" and "Rate of Divergence" part)
Well, you can methodically use Sigma notation: the inner loop runs about ceil(N/i) times for each i, so the total is
sum(i = 1 to N) ceil(N/i) = N * H(N) + O(N) = Theta(N log N)
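Here is a small counting sketch (my own, in C++) showing the N·H(N) growth; the ratio count / (N ln N) approaches 1 only slowly because of the lower-order Euler-Mascheroni term in H(N):

#include <cstdio>
#include <cmath>

int main() {
    for (int N = 1000; N <= 1000000; N *= 10) {
        long long count = 0;
        for (int i = 1; i <= N; i++)
            for (int j = 1; j <= N; j += i)
                count++;              // one iteration of the inner loop
        std::printf("N = %8d  count = %10lld  N ln N = %12.0f\n",
                    N, count, N * std::log((double)N));
    }
}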
Consider the following randomized search algorithm on a sorted array a of length n (in increasing order). x can be any element of the array.
size_t randomized_search(value_t a[], size_t n, value_t x) {
    size_t l = 0;
    size_t r = n - 1;
    while (true) {
        size_t j = rand_between(l, r);   // uniform random index in [l, r]
        if (a[j] == x) return j;
        if (a[j] < x) l = j + 1;
        if (a[j] > x) r = j - 1;
    }
}
What is the expected Big-Theta complexity (bounded both below and above) of this function when x is selected randomly from a?
Although this seems to be log(n), I carried out an experiment with instruction counts, and found that the result grows a little faster than log(n) (according to my data, (log(n))^1.1 fits the result even better).
Someone told me that this algorithm has an exact big theta complexity (so obviously log(n)^1.1 is not the answer). So, could you please give the time complexity along with your approach to prove it? Thanks.
Update: the data from my experiment, fit in Mathematica against log(n) and against log(n)^1.1 (fit plots omitted).
If you're willing to switch to counting three-way compares, I can tell you the exact complexity.
Suppose that the key is at position i, and I want to know the expected number of compares with position j. I claim that position j is examined if and only if it's the first position between i and j inclusive to be examined. Since the pivot element is selected uniformly at random each time, this happens with probability 1/(|i - j| + 1).
The total complexity is the expectation over i <- {1, ..., n} of sum_{j=1}^n 1/(|i - j| + 1), which is
sum_{i=1}^n 1/n sum_{j=1}^n 1/(|i - j| + 1)
= 1/n sum_{i=1}^n (sum_{j=1}^i 1/(i - j + 1) + sum_{j=i+1}^n 1/(j - i + 1))
= 1/n sum_{i=1}^n (H(i) + H(n + 1 - i) - 1)
= 1/n sum_{i=1}^n H(i) + 1/n sum_{i=1}^n H(n + 1 - i) - 1
= 1/n sum_{i=1}^n H(i) + 1/n sum_{k=1}^n H(k) - 1 (k = n + 1 - i)
= 2 H(n + 1) - 3 + 2 H(n + 1)/n - 2/n
= 2 H(n + 1) - 3 + O(log n / n)
= 2 log n + O(1)
= Theta(log n).
(log means natural log here.) Note the -3 in the low order terms. This makes it look like the number of compares is growing faster than logarithmic at the beginning, but the asymptotic behavior dictates that it levels off. Try excluding small n and refitting your curves.
Assuming rand_between to implement sampling from a uniform probability distribution in constant time, the expected running time of this algorithm is Θ(lg n). Informal sketch of a proof: the expected value of rand_between(l, r) is (l+r)/2, the midpoint between them. So each iteration is expected to skip half of the array (assuming the size is a power of two), just like a single iteration of binary search would.
More formally, borrowing from an analysis of quickselect, observe that when you pick a random index, half of the time it will be between ¼n and ¾n, and then neither the left nor the right subarray has more than ¾n elements. The other half of the time, neither has more than n elements (obviously). That leads to a recurrence relation
T(n) = ½T(¾n) + ½T(n) + f(n)
where f(n) is the amount of work in each iteration. Subtracting ½T(n) from both sides, then doubling both sides, we have
½T(n) = ½T(¾n) + f(n)
T(n) = T(¾n) + 2f(n)
Now, since 2f(n) = Θ(1) = Θ(nᶜ log⁰ n) where c = log_{4/3}(1) = 0, it follows by the master theorem that T(n) = Θ(n⁰ lg n) = Θ(lg n).
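A small simulation sketch (my own harness, not from the question) makes the constant term visible: taking a[j] = j as the sorted array, the average number of three-way compares tracks 2 ln n + 2γ - 3 ≈ 2 ln n - 1.85:

#include <cstdio>
#include <cmath>
#include <random>

int main() {
    std::mt19937 gen(12345);
    const int trials = 20000;
    for (int n = 1000; n <= 1000000; n *= 10) {
        double compares = 0;
        for (int t = 0; t < trials; t++) {
            // a[j] == j for a sorted array, so compare indices directly
            int x = std::uniform_int_distribution<int>(0, n - 1)(gen);
            int l = 0, r = n - 1;
            while (true) {
                int j = std::uniform_int_distribution<int>(l, r)(gen);
                compares += 1;                 // one three-way compare
                if (j == x) break;
                if (j < x) l = j + 1; else r = j - 1;
            }
        }
        std::printf("n = %8d  avg = %7.3f  2 ln n - 1.85 = %7.3f\n",
                    n, compares / trials, 2 * std::log((double)n) - 1.85);
    }
}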
So for a homework we had to count the number of steps in a piece of code. Here it is:
int sum = 0;
for (int i = 1; i <= n*n; i++)
    for (int j = 1; j <= i; j++)
        for (int k = 1; k <= 6; k++)
            sum++;
My prof (I think) explained that the number of operations in the 2nd line could be found using summation notation, like so:
( sum(i = 1 to n^2) i ) * 4 + 3
which would be (1/2)(n^4 + n^2) * 4 + 3 = 2n^4 + 2n^2 + 3
but from just looking at the line, I would think it would be something like 4n^4 + 2 (my prof said 4n^4 + 3; I'm not sure where the third operation is, though...).
Am I doing the summation notation wrong here? It made sense to me to do summation notation for nested for loops, but I don't know why it would work for a for loop by itself.
Thanks.
Actually, even your prof's result is wrong. The exact result is 3n^4 + 3n^2.
To obtain that result, simply consider:
sum(i = 1 to n^2) sum(j = 1 to i) sum(k = 1 to 6) 1
= sum(i = 1 to n^2) 6i
= 6 * (n^2)(n^2 + 1)/2
= 3n^4 + 3n^2
All passages are pretty simple (the passage from the double sum to the closed form is immediate if you consider the formula for the sum of the first n natural numbers).
I guess both you and your professor are wrong. According to my calculation (I might be wrong too), it should be 3n^4 + 3n^2.
The outermost loop will run n^2 times. Taking this into consideration, the inner loop will run 1 time on the first outer iteration, 2 on the second, and so on, i.e. j goes from 1 to i for i = 1, 2, 3, 4, ..., n^2. Summing the series 1 + 2 + 3 + ... + n^2 gives (n^2(n^2+1))/2.
So over the n^2 iterations of the outer loop, the inner loop body starts (n^2(n^2+1))/2 times. The innermost loop executes six times for every iteration of the second loop, so multiplying (n^2(n^2+1))/2 by 6 evaluates to 3n^4 + 3n^2.
To check the answer, let's take an example. Say n = 5: run the algorithm and print sum; this gives 1950. Now substitute n = 5 into the evaluated expression: 3(5^4) + 3(5^2), which again evaluates to 1950.
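That check is easy to script. A minimal brute-force sketch (mine) comparing the raw count against 3n^4 + 3n^2 for several n:

#include <cstdio>

int main() {
    for (long long n = 1; n <= 6; n++) {
        long long sum = 0;
        for (long long i = 1; i <= n * n; i++)
            for (long long j = 1; j <= i; j++)
                for (long long k = 1; k <= 6; k++)
                    sum++;
        std::printf("n = %lld  sum = %6lld  3n^4 + 3n^2 = %6lld\n",
                    n, sum, 3 * n * n * n * n + 3 * n * n);
    }
}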
What you need to calculate is this:
S = sum(i in 1..n^2) sum(j in 1..i) sum(k in 1..6) 1
Now, the innermost sum is obviously 6, hence we have
S = sum(i in 1..n^2) sum(j in 1..i) 6
= 6 sum(i in 1..n^2) sum(j in 1..i) 1
The innermost sum is just the sum of the first i numbers, which you should know is i(i + 1)/2, giving
S = 6 sum(i in 1..n^2) i(i + 1)/2
= 3 sum(i in 1..n^2) i(i + 1)
= 3 sum(i in 1..n^2) (i^2 + i)
We can separate this into two sums:
S = 3 [ (sum(i in 1..n^2) i^2) + (sum(i in 1..n^2) i) ]
The second sum there is just our old friend, the sum of the first n^2 numbers, so expanding that is easy.
The first sum there is a new friend, the sum of the first n^2 squares. You can google for that if you don't know it off hand.
Drop in the formulae, expand a little, tidy with a broom, and you should get your answer.
Cheers!