Drawing Recurrence Tree and Analysis - big-o

I am watching Introduction to Algorithms (MIT) lecture 1. There's something like the following in the analysis of merge sort:
T(n) = 2T(n/2) + O(n)
A few questions:
Why does the work at the bottom level become O(n)? It said that the boundary case may have a different constant, but I still don't get it.
It's said that the total is cn(lg n) + O(n). Where does the O(n) part come from? Is it the original O(n)?

Although this one's been answered a lot, here's one way to reason about it:
If you expand the recursion you get:
t(n) = 2 * t(n/2) + O(n)
t(n) = 2 * (2 * t(n/4) + O(n/2)) + O(n)
t(n) = 2 * (2 * (2 * t(n/8) + O(n/4)) + O(n/2)) + O(n)
...
t(n) = 2^k * t(n / 2^k) + O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n/2^(k-1))
The expansion stops when 2^k = n, which means k = log_2(n).
That makes n / 2^k = 1, which makes the first term easy to express if we take t(1) = c (a constant).
t(n) = n * c + O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n/2^(k-1))
If we consider the sum O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n/2^(k-1)), we can observe that there are exactly k terms and that each term is equivalent to n. So we can rewrite it like so:
t(n) = n * c + {n + n + n + .. + n} <-- where n appears k times
t(n) = n * c + n * k
but since k = log_2(n), we have
t(n) = n * c + n * log_2(n)
And since in Big-Oh notation n * log_2(n) is equivalent to n * log n (changing the base of the logarithm only changes a constant factor), and it grows faster than n * c, it follows that the Big-O of the closed form is:
O(n * log n)
I hope this helps!
EDIT
To clarify the first question: the work at the bottom becomes O(n) because you have n unit operations taking place (there are n leaf nodes in the expansion tree, and each takes a constant time c to complete). In the closed form, the work at the bottom is expressed as the first term of the sum: 2^k * t(1). As noted above, the tree has k levels and therefore 2^k = n leaves, and the unit operation t(1) takes constant time.
To answer the second question: the O(n) does not actually come from the original O(n); it represents the work at the bottom (see the answer to the first question above).
The original O(n) is the time required to merge the two sub-solutions t(n/2). Since the cost of the merge operation is assumed to grow linearly with the size of the problem, at each level you have a sum of 2^level terms, each costing O(n / 2^level); this is equivalent to one O(n) operation per level. Now, since you have k levels, the merge cost for the initial problem is {O(n) at each level} * {number of levels}, which is essentially O(n) * k. Since there are k = log(n) levels, it follows that the time complexity of the merge work is O(n * log n).
Finally, when you examine all the operations performed, you see that the work at the bottom is less than the work performed to merge the solutions. Mathematically speaking, the work performed for each of the n items grows asymptotically slower than the work performed to merge the sub-solutions; put differently, for large values of n, the merge operation dominates. So in Big-Oh analysis, the formula becomes O(n * log(n)).
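If it helps to see the numbers, here is a minimal Python sketch (my own, not from the lecture or the answer above) that unrolls T(n) = 2T(n/2) + n for powers of two, assuming T(1) = 1 as the base case, and compares it with n*log2(n) + n:

from math import log2

def t(n):
    # work given by the recurrence T(n) = 2*T(n/2) + n with T(1) = 1
    if n == 1:
        return 1
    return 2 * t(n // 2) + n

for k in range(1, 11):
    n = 2 ** k
    # the recurrence and the closed form n*lg(n) + n agree exactly here;
    # the "+ n" part is the work at the bottom (the n leaves)
    print(n, t(n), n * log2(n) + n)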

Related

How do you find the complexity of an algorithm given the number of computations performed each iteration?

Say there is an algorithm with input of size n. On the first iteration, it performs n computations, then is left with a problem instance of size floor(n/2) - for the worst case. Now it performs floor(n/2) computations. So, for example, an input of n=25 would see it perform 25+12+6+3+1 computations until an answer is reached, which is 47 total computations. How do you put this into Big O form to find worst case complexity?
You just need to write the corresponding recurrence in a formal manner:
T(n) = T(n/2) + n = n + n/2 + n/4 + ... + 1 =
n(1 + 1/2 + 1/4 + ... + 1/n) < 2 n
=> T(n) = O(n)
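For a quick sanity check, here is a tiny Python sketch of my own (using floor division as in the question) that reproduces the 25 -> 47 example and confirms the total stays below 2n:

def t(n):
    # total computations for the process n -> floor(n/2) down to 1
    return n if n <= 1 else n + t(n // 2)

print(t(25))  # 47, matching the worked example in the question
for n in (25, 100, 10**4, 10**7):
    print(n, t(n), t(n) < 2 * n)  # always below 2n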

Calculating the Recurrence Relation T(n)=T(n / [(log n)^2]) + Θ(1)

I tried to solve this problem for many hours and I think the solution is O(log n / [log (log n)^2]), but I'm not sure. Is this solution correct?
Expand the equation:
T(n) = T((n / log^2(n)) / log^2(n / log^2(n))) + Theta(1) + Theta(1)
     = T(n / (log^2(n) * (log(n) - 2*log(log(n)))^2)) + 2 * Theta(1)
Since (log(n) - 2*log(log(n)))^2 < log^2(n), the denominator is less than log^4(n), so n / (log^2(n) * (log(n) - 2*log(log(n)))^2) is greater than n / log^4(n) asymptotically. In other words, at each step n is divided by at most log^2(n), where log(n) refers to the original n. Hence, if we count how many times n can be divided by log^2(n) before reaching 1, that count is a lower bound for T(n).
Hence, the height of the expansion tree will be k such that
n = (log^2(n))^k = log^(2k)(n) => (take a log of both sides)
log(n) = 2k log(log(n)) => k = log(n)/(2 * log(log(n)))
Therefore, T(n) = Omega(log(n)/log(log(n))).
For the upper bound, note that the value of n after the i-th step is less than n / log^i(n) (i.e., instead of dividing by log^2(n) each time, we only divide by log(n)), so the number of times n can be divided by log(n) is an upper bound for T(n). Hence, as:
n = log^k(n) => log(n) = k log(log(n)) => k = log(n) / log(log(n))
we can say T(n) = O(log(n) / log(log(n))).
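As a rough numeric sanity check (my own sketch, not part of the answer), the following counts how many times n can be divided by log(n)^2 before it drops to a small constant, and compares the count against the two claimed bounds:

from math import log

def height(n):
    # number of steps of n -> n / (log n)^2; stop at 4 rather than 1 to
    # stay clear of the region where log(n)^2 < 1 and the value would bounce
    steps = 0
    while n > 4:
        n = n / (log(n) ** 2)
        steps += 1
    return steps

for n in (10**6, 10**12, 10**24, 10**48):
    lg, lglg = log(n), log(log(n))
    # lower bound, observed height, upper bound
    print(n, lg / (2 * lglg), height(n), lg / lglg)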

Solve the recurrence of T(n) = 3T(n/5) + T(n/2) + 2^n

I am solving this recurrence T(n) under the assumption that T(n) is constant for n <= 2. I started to solve T(n) with the tree method, since we cannot use the master theorem here, but when I draw the tree and calculate the cost c at each level, my c's come out very non-trivial and weird. For the root I get
c = 2^n, and then for the next level I get 3 * 2^(n/5) + 2^(n/2).
I don't know how to proceed with these values. Is there anything I am doing wrong, or what procedure should I follow to solve this?
You might want to compare those level costs against the 2^n term at the root.
3 * 2^(n/5) + 2^(n/2) is tiny compared with 2^n: note that 2^(n/5) = (2^n)^(1/5) and 2^(n/2) = (2^n)^(1/2), i.e., they are roots of 2^n rather than constant multiples of it.
The same thing happens at every deeper level, so the per-level costs shrink extremely fast as you go down the tree.
That means the total work is dominated by the root's cost 2^n:
T(n) = (1 + lower-order terms) * 2^n
which is just Theta(2^n), because the remaining factor tends to a constant and is insignificant as n gets very large.
You can simplify the case. As T(n) is increasing, we know that T(n/2) > T(n/5). Hence, T(n) < 4T(n/2) + 2^n. Now you can use the master theorem (case 3) and say that T(n) = O(2^n). On the other hand, without this replacement, since the 2^n term appears directly in T(n), we can say T(n) = Omega(2^n). Therefore, T(n) = Theta(2^n).
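Here is a small numeric check of my own (a sketch assuming floor divisions and T(n) = 1 for n <= 2): the ratio T(n) / 2^n quickly settles near a constant, which is what Theta(2^n) predicts.

from functools import lru_cache

@lru_cache(maxsize=None)
def t(n):
    # T(n) = 3T(n/5) + T(n/2) + 2^n with integer (floor) divisions
    if n <= 2:
        return 1
    return 3 * t(n // 5) + t(n // 2) + 2 ** n

for n in (10, 20, 40, 80, 160):
    print(n, t(n) / 2 ** n)  # ratio approaches 1, so T(n) = Theta(2^n)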

What is the complexity of sum of log functions

I have an algorithm that has the following complexity in big O:
log(N) + log(N+1) + log(N+2) + ... + log(N+M)
Is it the same as log(N+M) since it is the largest element?
Or is it M*log(N+M) because it is a sum of M elements?
The important rules to know in order to solve it are:
Log a + Log b = Log ab, and
Log a - Log b = Log a/b
Add and subtract Log 2 + Log 3 + ... + Log (N-1) to the given value.
This will give you Log 2 + Log 3 + ... + Log (N+M) - (Log 2 + Log 3 + ... + Log (N-1))
The first part will compute to Log ((N+M)!) and the part after the subtraction sign will compute to Log ((N-1)!)
Hence, this complexity comes to Log ( (N+M)! / (N-1)! ).
UPDATE after OP asked another good question in the comment:
If we have N + N^2 + N^3, it will reduce to just N^3 (the largest element), right? Why we can't apply the same logic here - log(N+M) - largest element?
If we have just two terms that look like Log(N) + Log(M+N), then we can combine them both and say that they will definitely be less than 2 * Log(M+N) and will therefore be O(Log (M+N)).
However, if there is a relation between the number of items being summed and the largest item's value, then that relation makes the calculation not quite so straightforward.
For example, the big O of a sum of 2 Log N terms is O(Log N), while the big O of a sum of N Log N terms is not O(Log N) but O(N * Log N).
In the given summation, both the values and the number of values depend on M and N; therefore we cannot state this complexity as Log(M+N), but we can definitely write it as M * Log(M+N).
How? Each of the values in the given summation is less than or equal to Log(M + N), and there are M such values in total. Hence the summation of these values is less than M * Log(M+N) and therefore is O(M * Log(M+N)).
Thus, both answers are correct but O(Log ( (N+M)! / (N-1)! )) is a tighter bound.
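As a quick numeric check (my own sketch, not part of the answer), the sum log(N) + ... + log(N+M) does equal log((N+M)! / (N-1)!), computed below via math.lgamma, and both sit below (M+1) * log(N+M):

from math import log, lgamma

def log_sum(N, M):
    # log(N) + log(N+1) + ... + log(N+M)
    return sum(log(N + k) for k in range(M + 1))

for N, M in ((10, 5), (1000, 1000), (10**6, 10)):
    exact = lgamma(N + M + 1) - lgamma(N)  # log((N+M)! / (N-1)!)
    print(N, M, log_sum(N, M), exact, (M + 1) * log(N + M))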
If M does not depend on N and does not vary then the complexity is O(log(N))
For k such that 0 <= k <= M, and assuming N >= M and N >= 2,
log(N+k)=log(N(1+k/N)) = log(N) + log(1+k/N) <= log(N) + log(2)
<= log(N) + log(N) <= 2 log(N)
So
log(N) + log(N+1) + log(N+2) + ... + log(N+M) <= 2(M+1) log(N)
So the complexity in big O is O(log(N)).
To answer your questions:
1) Yes, because there is a fixed number of elements, all less than or equal to log(N+M).
2) In fact there are M + 1 elements (from 0 to M).
Note that O((M+1) log(N+M)) is O(log(N)) when M is a fixed constant.
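A small Python sketch of my own to illustrate this case: with M held fixed, the sum divided by log(N) stays bounded (it tends to M + 1), so the whole expression is O(log N).

from math import log

M = 5  # an arbitrary fixed value, just for illustration
for N in (10, 10**3, 10**6, 10**12):
    s = sum(log(N + k) for k in range(M + 1))
    print(N, s / log(N))  # bounded; approaches M + 1 = 6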

Is the big-O complexity of these functions correct?

I am learning about algorithm complexity, and I just want to verify my understanding is correct.
1) T(n) = 2n + 1 = O(n)
This is because we drop the constants 2 and 1, and we are left with n. Therefore, we have O(n).
2) T(n) = n * n - 100 = O(n^2)
This is because we drop the constant -100, and are left with n * n, which is n^2. Therefore, we have O(n^2)
Am I correct?
Basically, you have these different levels determined by the "dominant" term of your function, starting from the lowest complexity:
O(1) if your function only contains constants
O(log(n)) if the dominant part is in log, ln...
O(n^p) if the dominant part is polynomial and the highest power is p (e.g. O(n^3) for T(n) = n*(3n^2 + 1) - 3)
O(p^n) if the dominant part is a fixed number to n-th power (e.g. O(3^n) for T(n) = 3 + n^99 + 2*3^n)
O(n!) if the dominant part is factorial
and so on...
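If it helps to see why the constants and lower-order terms drop out, here is a tiny sketch of my own using the two functions from the question: the ratios tend to fixed constants, which is exactly what O(n) and O(n^2) capture.

for n in (10, 100, 10_000, 1_000_000):
    # (2n + 1)/n tends to 2 and (n*n - 100)/n^2 tends to 1,
    # so only the dominant terms n and n^2 matter for big-O
    print(n, (2 * n + 1) / n, (n * n - 100) / n ** 2)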
