Recursion tree and binary tree cost calculation

I've got the following recursion:
T(n) = T(n/3) + T(2n/3) + O(n)
The height of the tree would be log_{3/2} n. Now the recursion tree for this recurrence is not a complete binary tree; it has missing nodes lower down. This makes sense to me; however, I don't understand how the following small-omega claim relates to the cost of all the leaves in the tree:
"... the total cost of all leaves would then be Θ(n^(log_{3/2} 2)) which, since log_{3/2} 2 is a constant strictly greater than 1, is ω(n lg n)."
Can someone please help me understand how Θ(n^(log_{3/2} 2)) becomes ω(n lg n)?

OK, to answer your explicit question about why n^(log_{3/2} 2) is ω(n lg n):
For all k > 1, n^k grows faster than n lg n, since n^(k-1) eventually dominates lg n (polynomials grow faster than logarithms). Since 2 > 3/2, we have log_{3/2} 2 > 1, and thus n^(log_{3/2} 2) grows faster than n lg n. And since our function is in Θ(n^(log_{3/2} 2)), it must also be in ω(n lg n).
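As a quick numeric sanity check (my own sketch, not part of the original answer), you can watch the ratio n^(log_{3/2} 2) / (n lg n) grow without bound, which is exactly what ω(n lg n) asserts:

import math

k = math.log(2, 1.5)              # log_{3/2} 2 ≈ 1.7095, the exponent from the recursion tree
for n in [2**10, 2**20, 2**30, 2**40]:
    leaves = n ** k               # Θ(n^(log_{3/2} 2)): the hypothetical total leaf cost
    nlgn = n * math.log2(n)       # n lg n
    print(n, leaves / nlgn)       # the ratio diverges, so the leaf cost is ω(n lg n)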


Is CLRS completely accurate to state that max-heapify running time is described by the recurrence `T(n) = T(2n/3) + O(1)`?

In CLRS on page 155, about max-heaps, the running time of max-heapify is described as T(n) = T(2n/3) + O(1).
I understand why the first recursive call is on a subproblem of size 2n/3 in the case where we have a nearly complete binary tree (always the case with heaps) in which the deepest level of nodes is half full (and we are recursing on the child that is the root of the subtree that contains these nodes at the deepest level). A more in-depth explanation of this is here.
What I don't understand is: after that first recursive call, the subtree is now a complete binary tree, so the next recursive calls will be on problems of size n/2.
So is it accurate to simply state that the running time of max-heapify is described by the recurrence T(n) = T(2n/3) + O(1)?
Converting my comment to an answer: if you assume that T(n), the time required by max-heapify on a subtree of n nodes, is a nondecreasing function of n, then we know that T(m) ≤ T(n) for any m ≤ n. You're correct that the ratio 2n/3 is the worst-case ratio and that after the first level of the recurrence it won't be reached, but under the above assumption you can safely conclude that T(n/2) ≤ T(2n/3), so we can upper-bound the recurrence as
T(n) ≤ T(2n / 3) + O(1)
even if strict equality doesn't hold. That then lets us use the master theorem to conclude that T(n) = O(log n).
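For concreteness, here is a minimal Python sketch of max-heapify (my own illustration of the standard algorithm, not CLRS's pseudocode verbatim). Each call does O(1) work and recurses into exactly one child's subtree, which in the worst case holds about 2n/3 of the nodes:

def max_heapify(a, i, heap_size):
    # Sift a[i] down until the subtree rooted at i is a max-heap (0-indexed array).
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        # Exactly one recursive call, on a subtree of at most ~2n/3 nodes,
        # so T(n) ≤ T(2n/3) + O(1), which solves to O(log n).
        max_heapify(a, largest, heap_size)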

MergeSort - dividing a sequence into two unequal subsequences

A question from a test:
The division of the array is not regular; the array will be divided into two unequal subsequences:
the first subsequence of size n/3,
the second subsequence of size (2/3)n.
Calculate the cost of mergesort.
How can I deal with problems like these when the division is not regular?
mid = start + (last - start)/3;      // split point at one third of the range
mergesort(array, start, mid);        // first subsequence, size ≈ n/3
mergesort(array, mid + 1, last);     // second subsequence, size ≈ 2n/3
fusione(array, start, mid, last);    // merge ("fusione"), cost Θ(n)
Let's start by writing out a recurrence relation. You'll split the problem into subarrays of size n / 3 and 2n / 3, and then in the merge step still do linear work to combine them. That gives the recurrence
T(0) = 1
T(n) = T(n / 3) + T(2n / 3) + Θ(n)
The question now is how to solve the recurrence relation. I'm going to claim that this is Θ(n log n). To see this, we'll prove that it's Ω(n log n) and that it's O(n log n) by using the recursion tree method.
Think about expanding out this recursion using a recursion tree. Notice that
The top layer does Θ(n) work.
The next layer has a subcall of size n / 3 and a subcall of size 2n / 3, which collectively do Θ(n) work.
The layer below that has a subcall of size n/9, two subcalls of size 2n/9, and a subcall of size 4n/9. Collectively, they do Θ(n) work.
More generally, up until the point where the n/3 branches die off, the top layers of the tree all do Θ(n) work. The number of layers before you start to have the recursion die off is roughly log_3 n, so the work done is at least Ω(n log n) due to Θ(log n) layers doing Θ(n) work.
You can also notice that the work per layer is always O(n), because the size of the subproblems is always no greater than the size of the subproblems on the previous layer (it's equal for the first few layers, then drops as those layers drop off). Therefore, an upper bound will be O(nL), where L is the total number of layers. The slowest problem to shrink shrinks by a factor of 2/3 at each layer, so there will be O(log n) total layers. This gives an upper bound of O(n log n).
Since the work is O(n log n) and Ω(n log n), it's therefore Θ(n log n).
Hope this helps!
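For reference, here is a runnable Python version of the uneven-split mergesort from the question (my own rendering; a standard merge stands in for the question's `fusione`):

def mergesort_thirds(a, start, last):
    # Sort a[start..last] inclusive, splitting at one third of the range.
    if last <= start:
        return
    mid = start + (last - start) // 3     # n/3 on the left, 2n/3 on the right
    mergesort_thirds(a, start, mid)
    mergesort_thirds(a, mid + 1, last)
    merged = []                           # merge step: Θ(n) work per call
    i, j = start, mid + 1
    while i <= mid and j <= last:
        if a[i] <= a[j]:
            merged.append(a[i]); i += 1
        else:
            merged.append(a[j]); j += 1
    merged.extend(a[i:mid + 1])
    merged.extend(a[j:last + 1])
    a[start:last + 1] = merged

data = [5, 2, 9, 1, 7, 3, 8, 6, 4]
mergesort_thirds(data, 0, len(data) - 1)
print(data)                               # [1, 2, 3, 4, 5, 6, 7, 8, 9]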
The correct answer is n log_{3/2} n, because it is the result of the recurrence T(n) = T(n/3) + T(2n/3) + Θ(n): the longest branch shrinks by a factor of 2/3 per level, giving log_{3/2} n levels of at most Θ(n) work each.
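As a quick numeric check of that recurrence (my own sketch; the base case T(n ≤ 1) = 1 is an arbitrary choice):

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Expand T(n) = T(n/3) + T(2n/3) + n numerically.
    if n <= 1:
        return 1
    return T(n // 3) + T(n - n // 3) + n

for n in [10**2, 10**3, 10**4, 10**5]:
    print(n, T(n) / (n * math.log2(n)))   # the ratio stays bounded, consistent with Θ(n log n)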

Give an asymptotic upper bound on the height of an n-node binary search tree in which the average depth of a node is Θ(lg n)

Recently, I've been trying to solve all the exercises in CLRS, but there are some of them I can't figure out. Here is one of them, from CLRS exercise 12.4-2:
Describe a binary search tree on n nodes such that the average depth of a node in the tree is Θ(lg n) but the height of the tree is ω(lg n). Give an asymptotic upper bound on the height of an n-node binary search tree in which the average depth of a node is Θ(lg n).
Can anyone share some ideas or references to solve this problem? Thanks.
So let's suppose that we build the tree this way: given n nodes, take f(n) nodes and set them aside. Then build a tree where the root has a left subtree that's a perfect binary tree of n - f(n) - 1 nodes and a right subtree that's a chain of f(n) nodes. We'll pick f(n) later.
So what's the average depth in the tree? Since we just want an asymptotic bound, let's pick n such that n - f(n) - 1 is one less than a perfect power of two, say 2^k - 1. In that case, the sum of the depths in this part of the tree is 1·2 + 2·3 + 4·4 + 8·5 + ... + 2^(k-1)·k, which is (IIRC) about k·2^k, which is just about (n - f(n)) log (n - f(n)) by our choice of k. In the other part of the tree, the total depth is about f(n)^2. This means that the average path length is about ((n - f(n)) log (n - f(n)) + f(n)^2) / n. Also, the height of the tree is f(n). So we want to maximize f(n) while keeping the average depth O(log n).
To do this, we need to find f(n) such that
1. n - f(n) = Θ(n), since otherwise the log term in the numerator disappears and the average depth isn't logarithmic, and
2. f(n)^2 / n = O(log n), since otherwise the second term in the numerator gets too big.
If you pick f(n) = Θ(sqrt(n log n)), I think that 1 and 2 are satisfied maximally. So I'd wager (though I could be totally wrong about this) that this is as good as you can get. You get a tree of height Θ(sqrt(n log n)) that has average depth Θ(log n).
Hope this helps! If my math is way off, please let me know. It's late now and I haven't done my usual double-checking. :-)
First, maximize the height of the tree (use a tree where each node has only one child, so you have a long chain going downward).
Check the average depth (obviously the average depth will be too high).
While the average depth is too high, decrease the height of the tree by one.
There are many ways to decrease the height of the tree by one. Choose the way that minimizes the average depth (prove by induction that each time you should select the one that minimizes the average depth). Keep going until you fall under the average depth requirement (e.g. derive by induction a formula for the height and the average depth).
If you are trying to maximize the height of a tree while minimizing the average depth of all its nodes, the unambiguously best shape is an "umbrella": a full binary tree with k nodes and height lg k, where 0 < k < n, together with a single path, or "tail", of n - k nodes coming out of one of the leaves of the full part. The height of this tree is roughly lg k + n - k.
Now let's compute the total depth of all the nodes. The sum of the depths of the nodes of the full part is sum[ j * 2^j ], where the sum is taken from j=0 to j=lg k. By some algebra, the dominant term of the result is 2k lg k.
Next, the sum of the depths of the tail part is given by sum[i + lg k], where the sum is taken from i=0 to i=n-k. By some algebra, the result is approximately (n-k)lg k + (1/2)(n-k)^2.
Hence, summing the two parts above together and dividing by n, the average depth of all the nodes is (1 + k/n) lg k + (n-k)^2 / (2n). Note that because 0 < k < n, the first term here is O(lg n) no matter what k we choose. Hence, we need only make sure the second term is O(lg n). To do so, we require that (n-k)^2 = O(n lg n), or k = n - O(sqrt(n lg n)). With this choice, the height of the tree is
lg k + n − k = O(sqrt(n lg n))
This is asymptotically larger than the ordinary Θ(lg n) height, and it is asymptotically the tallest you can make the tree while keeping the average depth O(lg n).
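As an empirical check of this construction (my own sketch; the exact constants are arbitrary), the following computes the average depth and height of the umbrella shape with k = n - sqrt(n lg n), without actually building the tree:

import math

def umbrella_stats(n):
    # Full binary tree of k = n - sqrt(n lg n) nodes, with a chain of n - k
    # nodes hanging off one of its deepest leaves.
    k = max(1, int(n - math.sqrt(n * math.log2(n))))
    total, count, d = 0, 0, 0
    while count < k:                      # sum of depths over the full part
        level = min(2 ** d, k - count)    # depth d holds at most 2^d nodes
        total += d * level
        count += level
        d += 1
    full_height = d - 1
    tail = n - k                          # tail depths: full_height + 1 .. full_height + tail
    total += tail * full_height + tail * (tail + 1) // 2
    return total / n, full_height + tail

for n in [2**10, 2**14, 2**18]:
    avg, height = umbrella_stats(n)
    # The average depth stays within a constant factor of lg n,
    # while the height grows like sqrt(n lg n).
    print(n, round(avg / math.log2(n), 2), height)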

Lower bounds on comparison sorts for a small fraction of inputs?

Can someone please walk me through the mathematical part of the solution to the following problem?
Show that there is no comparison sort whose running time is linear for at least half
of the n! inputs of length n. What about a fraction of 1/n of the inputs of length n?
What about a fraction of 1/2^n of the inputs?
Solution:
If the sort runs in linear time for m input permutations, then the height h of the
portion of the decision tree consisting of the m corresponding leaves and their
ancestors is linear.
Use the same argument as in the proof of Theorem 8.1 to show that this is impossible
for m = n!/2, n!/n, or n!/2^n.
We have 2^h ≥ m, which gives us h ≥ lg m. For all the possible values of m given here, lg m = Ω(n lg n), hence h = Ω(n lg n).
In particular, using lg n! ≥ n lg n − n lg e (which follows from Stirling's approximation, since n! ≥ (n/e)^n):
lg(n!/2) = lg n! − 1 ≥ n lg n − n lg e − 1
lg(n!/n) = lg n! − lg n ≥ n lg n − n lg e − lg n
lg(n!/2^n) = lg n! − n ≥ n lg n − n lg e − n
Each of these proofs are a straightforward modification of the more general proof that you can't have a comparison sort that sorts any faster than Ω(n log n) (you can see this proof in this earlier answer). Intuitively, the argument goes as follows. In order for a sorting algorithm to work correctly, it has to be able to determine what the initial ordering of the elements is. Otherwise, it can't reorder the values to put them in ascending order. Given n elements, there are n! different permutations of those elements, meaning that there are n! different inputs to the sorting algorithm.
Initially, the algorithm knows nothing about the input sequence, and it can't distinguish between any of the n! different permutations. Every time the algorithm makes a comparison, it gains a bit more information about how the elements are ordered. Specifically, it can tell whether the input permutation is in the group of permutations where the comparison yields true or in the group of permutations where the comparison yields false. You can visualize how the algorithm works as a binary tree, where each node corresponds to some state of the algorithm, and the (up to) two children of a particular node indicate the states of the algorithm that would be entered if the comparison yields true or false.
In order for the sorting algorithm to be able to sort correctly, it has to be able to enter a unique state for each possible input, since otherwise the algorithm couldn't distinguish between two different input sequences and would therefore sort at least one of them incorrectly. This means that if you consider the number of leaf nodes in the tree (parts where the algorithm has finished comparing and is going to sort), there must be at least one leaf node per input permutation. In the general proof, there are n! permutations, so there must be at least n! leaf nodes. In a binary tree, the only way to have k leaf nodes is to have height at least Ω(log k), meaning that you have to do at least Ω(log k) comparisons. Thus the general sorting lower bound is Ω(log n!) = Ω(n log n) by Stirling's approximation.
In the cases that you're considering, we're restricting ourselves to a subset of those possible permutations. For example, suppose that we want to be able to sort n!/2 of the permutations. This means that our tree must have height at least lg(n!/2) = lg n! − 1 = Ω(n log n). As a result, you can't sort in time O(n), because no linear function grows at the rate Ω(n log n). For the second part, seeing if you can sort n!/n of the inputs in linear time, again the decision tree would have to have height lg(n!/n) = lg n! − lg n = Ω(n log n), so you can't sort in O(n) comparisons. For the final one, we have that lg(n!/2^n) = lg n! − n = Ω(n log n) as well, so again it can't be sorted in O(n) time.
However, you could sort a set of 2^n permutations with linearly many comparisons, since lg(2^n) = n = O(n).
Hope this helps!
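As a numeric illustration (my own sketch), here is how the four quantities compare; lg(n!) is computed via the log-gamma function:

import math

for n in [16, 64, 256, 1024]:
    lg_fact = math.lgamma(n + 1) / math.log(2)   # lg(n!)
    print(n,
          round(lg_fact - 1),                    # lg(n!/2):   still ~ n lg n
          round(lg_fact - math.log2(n)),         # lg(n!/n):   still ~ n lg n
          round(lg_fact - n),                    # lg(n!/2^n): still ~ n lg n
          n)                                     # lg(2^n) = n: merely linear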

How do you calculate the big oh of the binary search algorithm?

I'm looking for the mathematical proof, not just the answer.
The recurrence relation of binary search is (in the worst case)
T(n) = T(n/2) + O(1)
Using the master theorem for recurrences of the form T(n) = a·T(n/b) + f(n), where:
n is the size of the problem.
a is the number of subproblems in the recursion.
n/b is the size of each subproblem. (Here it is assumed that all subproblems are essentially the same size.)
f(n) is the cost of the work done outside the recursive calls, which includes the cost of dividing the problem and the cost of merging the solutions to the subproblems.
Here a = 1, b = 2 and f(n) = O(1) [constant].
We have f(n) = Θ(1) = Θ(n^(log_b a)), since log_b a = log_2 1 = 0 and hence n^(log_b a) = n^0 = 1.
By case 2 of the master theorem, T(n) = Θ(n^(log_b a) · lg n) = Θ(lg n).
The proof is quite simple: with each recursion you halve the number of remaining items, if you haven't already found the item you were looking for. And since you can only divide a number n into halves at most log_2(n) times, that also bounds the number of recursions:
2·2·…·2·2 = 2^x ≤ n ⇒ log_2(2^x) = x ≤ log_2(n)
Here x is also the number of recursions, and with a local cost of O(1) per step it's O(log n) in total.
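A minimal iterative sketch (my own illustration) that counts the comparison steps:

def binary_search(a, target):
    # Return an index of target in sorted list a, or -1, in O(log n) steps.
    lo, hi = 0, len(a) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2     # O(1) work per level: T(n) = T(n/2) + O(1)
        if a[mid] == target:
            print("found after", steps, "steps")
            return mid
        elif a[mid] < target:
            lo = mid + 1         # discard the lower half
        else:
            hi = mid - 1         # discard the upper half
    return -1

binary_search(list(range(1024)), 700)   # found after 10 steps; lg(1024) = 10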
