Prove the height of a binary tree is lower-bounded by lg n - binary-tree

The following is a homework question. I don't like posting questions without showing some evidence of what I tried. Unfortunately, I find my professor's lectures hard to follow, so I went to the tutoring center. The tutors there were stumped (they're international students who never actually took this class at the University). After two days of visiting the tutoring department without a single tutor being able to help me, I've resorted to Stack Overflow.
The book we are using is Introduction to Algorithms.
I've read through the chapter and I can't figure out how to start or finish this.
Can someone please help me with this answer?

posted solution from professor:

A binary tree of height h has at most 1 node at level 0, at most 2 nodes at level 1, at most 4 nodes at level 2, and in general at most 2^l nodes at level l. Summing over the levels gives a geometric progression:
n <= 2^0 + 2^1 + ... + 2^(h-1) + 2^h = 2^0 (2^(h+1) - 1)/(2 - 1) = 2^(h+1) - 1,
which leads to n < 2^(h+1), i.e. lg(n) < h + 1.
Since h is an integer, h + 1 > lg(n) gives h >= floor(lg(n)), i.e. floor(lg n) is a lower bound on h.
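As a quick sanity check of the counting argument, here is a small Python sketch that computes the smallest height a binary tree with n nodes can have and compares it with floor(lg n). The function name min_height is made up for illustration and is not from the book.

```python
def min_height(n):
    """Smallest h such that a binary tree of height h can hold n nodes.

    A tree of height h holds at most 2^(h+1) - 1 nodes (levels 0..h full).
    """
    h = 0
    while 2 ** (h + 1) - 1 < n:
        h += 1
    return h


for n in (1, 2, 3, 4, 7, 8, 1000):
    floor_lg = n.bit_length() - 1      # floor(lg n) for n >= 1
    print(n, min_height(n), floor_lg)  # the last two columns agree
```

Because even the most compact tree needs floor(lg n) levels below the root, no binary tree with n nodes can be shorter than that.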

Related

Stuck at Algorithm pseudocode generation

I do not know what to do next (and even if my approach is correct) in the following problem:
Part 1
Part 2
I have just figured out that a possible MNT (for part a) is to get a jar, test if it breaks from height h, if so then there's the answer, if not, height+1 and keep looping.
For part b is the following. Since we know max height equals n, then we start from n (current height = n). Therefore we go from top to bottom adding to our broken jar count (they are supposed to break if you start from top) until the jars stop breaking. Then the number would be current height + 1 (because we need to go back one index).
For part c, I don't even know what my approach would be, since I am assuming that the order of the algorithm is O(n^c) where c is a fraction. I also know that O(n^c) is faster than O(n).
I also noted that there is a problem similar to this one online, but it talks about rungs instead of a robotic arm. Maybe it is similar? Here is the link
Do you have any recommendations/clues? Any help will be appreciated.
Thank you for your time and help in advance.
Cheers!
This is an answer for part (c).
The idea is to find some number k and apply the following scheme:
Drop a jar from height k:
If it breaks, test the other jar at heights 1, 2, ..., k-1 in increasing order until we find the height at which it breaks, for no more than k drops in total.
If it doesn't break, drop it again from height k + (k-1). Again, if it breaks, test the other jar in increasing order from k+1 up to k+(k-1)-1; otherwise continue to k + (k-1) + (k-2).
Continue this until you find the height
(of course, if at some point you would need to jump to a height greater than n, you just jump to n).
This scheme ensures we'll use at most k tries. So now the question is how to find a minimal k (as a function of n), for which the scheme will work. Since, at every step, we reduce by 1 our height advancement, the following equation must hold:
k + (k-1) + (k-2) + ... + 1 >= n
Otherwise we will "run out" of steps before reaching n. We want to find the smallest k for which the inequality holds.
There's a formula to the sum:
1 + 2 + ... + k = k(k+1)/2
Using that we get the equation:
k(k+1)/2 = n ===> k^2 + k - 2n = 0
Solving this (and if it's not integral take the ceiling of it) will give us k. Quadratic equations might have two solutions, but ignoring the negative one you get:
k = (-1 + sqrt(1 + 8n))/2
Looking at the complexity, we can ignore everything but the n, which has an exponent of 1/2 (since we're taking its square root), so k = O(sqrt(n)). That is actually better than the requested complexity of n to the power of 2/3.
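The scheme can be sketched in Python to check the bound of k drops. This is only an illustration under some assumptions: min_k and two_jar_search are made-up names, breaks(h) is a hypothetical predicate that is True when a jar dropped from height h breaks, and the second jar is tested upward from the last height known to be safe (it must not break before the exact height is found).

```python
import math


def min_k(n):
    """Smallest k with k*(k+1)/2 >= n."""
    k = (math.isqrt(8 * n + 1) - 1) // 2   # floor of the positive root
    if k * (k + 1) // 2 < n:
        k += 1
    return k


def two_jar_search(n, breaks):
    """Return (lowest breaking height or None, number of drops used)."""
    step = min_k(n)
    drops = 0
    low = 0                      # highest height known to be safe
    while low < n:
        h = min(low + step, n)   # never jump past n
        drops += 1
        if breaks(h):
            # First jar broke: test the untested heights low+1 .. h-1 upward.
            for g in range(low + 1, h):
                drops += 1
                if breaks(g):
                    return g, drops
            return h, drops
        low = h
        step -= 1                # each jump is one shorter than the last
    return None, drops


print(min_k(100))  # 14
print(two_jar_search(100, lambda h: h >= 57))
```

For n = 100 the bound is 14 drops, since 14 * 15 / 2 = 105 >= 100 while 13 * 14 / 2 = 91 < 100.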
For part (a) you can use binary search over the height. Pseudocode for this is below:
lo = 0                              // highest height known to be safe
hi = n                              // highest height that could still be safe
while (lo < hi) {
    mid = lo + (hi - lo + 1) / 2;   // round up so the loop always makes progress
    if (glass_breaks(mid)) {
        hi = mid - 1;
    } else {
        lo = mid;
    }
}
'lo' will contain the maximum safe height. It takes about log(n) steps in the worst case, whereas your approach may take n steps in the worst case.
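A runnable Python version of the same binary search, as a sketch: max_safe_height is a made-up name and breaks(h) is a hypothetical test that must be monotone (once a height breaks the glass, every greater height does too). The midpoint is rounded up so that the update lo = mid always shrinks the range and the loop terminates.

```python
def max_safe_height(n, breaks):
    """Return (highest height in 0..n where breaks() is False, probes used).

    Assumes breaks(h) is monotone: once True, it stays True for larger h.
    """
    lo, hi = 0, n
    probes = 0
    while lo < hi:
        mid = lo + (hi - lo + 1) // 2   # round up so lo = mid makes progress
        probes += 1
        if breaks(mid):
            hi = mid - 1
        else:
            lo = mid
    return lo, probes


height, probes = max_safe_height(1000, lambda h: h > 640)
print(height, probes)
```

The number of probes never exceeds ceil(lg(n + 1)), because each probe roughly halves the range of candidate heights.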
For part (b), you can use your approach from (a): start from the minimum height and increase the height by 1 until the glass breaks. This breaks at most 1 glass while determining the required height.

Why is the height of the recursion tree in merge sort lg(n)+1

I have read a few of the questions and answers suggested by Stack Overflow. I am following the book Introduction to Algorithms by Cormen for self-study. Everything is explained clearly in the book; the only thing that is not explained is how to calculate the height of the tree in the merge sort analysis.
I am still on chapter 2 and haven't gone far, in case that is explained in later chapters.
I want to ask: if the topmost node is divided in 2, and so on, that gives a height of lg(n), which is log2(n). What if I divide the main problem into five subproblems? Would it have been log5(n) then? Please explain why this is expressed as a logarithm and not as some linear term.
Thanks
Recursion trees represent self-calls in recursive procedures. Typical mergesort calls itself twice, each call sorting half the input, so the recursion tree is a complete binary tree.
Observe that complete binary trees of increasing height display a pattern in their numbers of nodes:
height   new "layer"   total nodes
 (h)      of nodes         (N)
  1          1              1
  2          2              3
  3          4              7
  4          8             15
 ...
Each new layer at level L has 2^L nodes (where level 0 is the root). So it's easy to see that total nodes N as a function of h is just
N = 2^h - 1
Now solve for h:
h = log_2 (N + 1)
If you have a 5-way split and merge, then each node in the tree has 5 children, not two. The table becomes:
height   new "layer"   total nodes
 (h)      of nodes         (N)
  1          1              1
  2          5              6
  3         25             31
  4        125            156
 ...
Here we have N = (5^h - 1) / 4. Solving for h,
h = log_5 (4N + 1)
In general for a K-way merge, the tree has N = (K^h - 1) / (K - 1), so the height is given by
h = log_K ((K - 1)N + 1) = O(log N) [the log's base doesn't matter to big-O]
However, be careful. In K-way mergesort, selecting each element to merge requires Theta(log K) time. You can't ignore this cost either theoretically or in practice!
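The node-count formula and its inverse can be checked with a few lines of Python. This is just a sketch; total_nodes and height_from_nodes are names invented here, and the inverse uses integer arithmetic rather than floating-point logarithms to avoid rounding problems.

```python
def total_nodes(k, h):
    """Nodes in a complete k-ary tree with h levels: 1 + k + ... + k^(h-1)."""
    return sum(k ** level for level in range(h))


def height_from_nodes(k, n):
    """Invert N = (k^h - 1) / (k - 1) using integers only (assumes k >= 2)."""
    target = (k - 1) * n + 1   # equals k^h for a complete tree
    h = 0
    while k ** h < target:
        h += 1
    return h


for k in (2, 5):
    for h in range(1, 5):
        n = total_nodes(k, h)
        print(k, h, n, height_from_nodes(k, n))
```

The printed rows reproduce both tables: for k = 2 the totals are 1, 3, 7, 15, and for k = 5 they are 1, 6, 31, 156.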

Why the total number of levels of the recursion tree of a merge sort is lg n + 1?

I think the question is pretty self-explanatory here but I am looking at "Introduction to Algorithms" 3rd edition page 37 and it says that the total # of levels of the recursion tree in Figure 2.5 is lg n + 1 but I do not understand why you have to +1. Can anyone please explain the rationale behind this? thanks
The tree should contain n leaves. A binary tree with h levels (the root is level 1) has at most 2^(h-1) leaves, so we need 2^(h-1) >= n, that is, h >= lg(n)+1. At the same time, every level of the recursion tree except possibly the last is completely filled, so a tree with h levels has at least 2^(h-2)+1 leaves; that gives 2^(h-2)+1 <= n, i.e. h <= lg(n-1)+2.
When n = 2^k, we get k+2 > h >= k+1, so h = k+1 = lg(n)+1, which is the case in the book.
What's more, when n != 2^k, there is a k with 2^k > n > 2^(k-1); then h >= lg(n)+1 > k and h <= lg(n-1)+2 < k+2, so h = k+1 = ceil(lg(n)+1).
In all, h = ceil(lg(n)+1), where ceil(x) denotes the smallest integer that is not smaller than x.
Let's say N is equal to 8. then, we have 4 levels:
1. full array with size 8.
2. halves with size 4.
3. quarters with size 2.
4. eighths with size 1.
That's lg n + 1. lg 8 = 3. lg 8 + 1 = 4.
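To see the same count for sizes that are not powers of two, here is a small Python sketch that walks down the recursion and counts levels. The function levels is a made-up helper, and it assumes the usual split into halves of sizes ceil(n/2) and floor(n/2).

```python
def levels(n):
    """Number of levels in the merge sort recursion tree for n elements."""
    if n <= 1:
        return 1                      # a single element is a leaf level
    return 1 + levels((n + 1) // 2)   # the larger half decides the depth


for n in (1, 2, 5, 8, 1000):
    ceil_lg = (n - 1).bit_length()    # ceil(lg n) for n >= 1
    print(n, levels(n), ceil_lg + 1)
```

For n = 8 this gives 4 levels, matching lg 8 + 1; in general the count is ceil(lg n) + 1.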
I have explained visually in detail how to calculate complexity of mergesort using recursion tree, have a look at it here

recursion tree method of solving recurrences

I was practicing the recursion tree method using this link: http://www.cs.cornell.edu/courses/cs3110/2012sp/lectures/lec20-master/lec20.html .. 1st example was okay but in the second example he calculates the height of the tree as log(base 3/2) n .. Can anyone tell me how he calculated height ? May be a dumb question but i can't understand! :|
Let me try to explain it. The recurrence you have is T(n) = T(n/3) + T(2n/3) + n. It says you are building a recursion tree that splits into two subtrees of sizes n/3 and 2n/3, and costs n at that level.
The height of the tree is determined by the height of its tallest subtree (+1). Here the right subtree, the one with 2n/3 elements, drives the height.
With that in mind, let's calculate the height. At depth 1 the largest subproblem has n*(2/3) elements, at depth 2 it has n*(2/3)^2 elements, and so on. We keep splitting until one element is left, so at height h
n*(2/3)^h <= 1
(take the log of both sides; log is an increasing function)
log(n) + h*log(2/3) <= 0
(log(2/3) = -log(3/2))
h*log(3/2) >= log(n)
h >= log(n)/log(3/2)
h >= log_{3/2}(n)
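The derivation can be checked numerically. The sketch below (longest_path_depth is an invented name) finds the smallest h with n*(2/3)^h <= 1 using integers only, and prints log base 3/2 of n next to it for comparison.

```python
import math


def longest_path_depth(n):
    """Smallest h with n * (2/3)^h <= 1, using integers only."""
    h = 0
    while n * 2 ** h > 3 ** h:   # same as n * (2/3)^h > 1
        h += 1
    return h


for n in (10, 100, 10 ** 6):
    print(n, longest_path_depth(n), math.log(n, 1.5))
```

The integer in the middle column is the logarithm in the last column rounded up, which is the height along the 2n/3 branch.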
I would suggest reading Master Method for Recursion from Introduction to Algorithms - CLRS.

What does Logn actually mean?

I am just studying for my class in Algorithms and have been looking over QuickSort. I understand the algorithm and how it works, but not how to get the number of comparisons it does, or what logn actually means, at the end of the day.
I understand the basics, to the extent of :
x = log_b(Y) means
b^x = Y
But what does this mean in terms of algorithm performance? It's the number of comparisons you need to do, I understand that...the whole idea just seems so unintelligible though. Like, for QuickSort, each level K invocation involves 2^k invocations each involving sublists of length n/2^K.
So, summing to find the number of comparisons :
Σ_{k=0}^{log n} 2^k * 2(n/2^k) = 2n(1 + log n)
Why are we summing up to log n ? Where did 2n(1+logn) come from? Sorry for the vagueness of my descriptions, I am just so confused.
If you consider a full, balanced binary tree, then layer by layer you have 1 + 2 + 4 + 8 + ... vertices. If the tree has n layers, that is 1 + 2 + 4 + 8 + ... + 2^(n-1) = 2^n - 1 vertices in total. Now let N = 2^n - 1 be the size of the tree; since N is roughly 2^n, the height of the tree is n = log2(N + 1), which is about log2(N). That's what the log(n) means in these Big O expressions.
Below is a sample tree:
      1
    /   \
   2     3
  / \   / \
 4   5 6   7
The number of nodes in the tree is 7, but the height of the tree is log2(7 + 1) = 3. The log appears whenever you use a divide-and-conquer method. In quicksort you divide the list into 2 sublists and continue until you reach small lists. In the average case there are about log n levels of division, because that is the height of the division tree. Partitioning at each level takes O(n), because on average you partition N numbers in total at each level (there may be many lists to partition, but the sum of their lengths is N). So, as a simple observation, with a balanced partition tree the number of levels of partitioning is log n, which is the height of the tree.
1. Forget about binary trees for a second.
Here's the math: log2 N = k is the same as 2^k = N; that's the definition of log.
It could also be the natural log, log_e N = k, aka e^k = N, or the decimal log, log10 N = k, which is 10^k = N.
2. Look at a perfect, balanced tree:
1
1 + 1
1 + 1 + 1 + 1
8 ones
16 ones
etc.
How many elements? 1 + 2 + 4 + 8 + ... etc., so a 2-level binary tree has 2^2 - 1 elements, a 3-level tree has 2^3 - 1, and so on. So here's the magic formula: number_of_tree_elements = 2^(number_of_levels) - 1, or, using the definition of log: number_of_levels = log2(number_of_tree_elements + 1) (you can forget about the +1 for large trees).
3. Let's say the task is to find an element in a binary search tree with N elements and K levels (aka height),
where, by how the tree is constructed, height = log2(number_of_tree_elements).
Most important:
by how the tree is constructed, you need no more than 'height' operations to find an element among all N elements. And what does the height equal? log2(number_of_tree_elements).
So you need log2(number_of_tree_elements) operations, or log(N) for short.
To understand what O(log(n)) means you might want to read up on Big O notation. In short, it means that if your data set gets 1024 times bigger, your runtime only grows by about 10 more steps (for base 2), since log2(1024n) = log2(n) + 10.
MergeSort runs in O(n*log(n)), which means it will take somewhat more than 1024 times longer. Bubble sort runs in O(n^2), which means it will take 1024^2 = 1 048 576 times longer. So there really is some time to save :)
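A small numeric sketch of the three growth rates may help. It assumes idealized step counts of exactly log2(n), n*log2(n) and n^2 (real running times have constant factors that Big O hides), and growth_factors is a name invented for this example.

```python
import math


def growth_factors(n, scale=1024):
    """How much log n, n log n and n^2 grow when n becomes scale * n."""
    big = scale * n
    return {
        "log n (extra steps)": math.log2(big) - math.log2(n),
        "n log n (ratio)": (big * math.log2(big)) / (n * math.log2(n)),
        "n^2 (ratio)": (big / n) ** 2,
    }


for name, value in growth_factors(1024).items():
    print(name, value)
```

Starting from n = 1024, the logarithmic cost gains 10 steps, the n log n cost grows 2048-fold, and the quadratic cost grows 1 048 576-fold.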
To understand your sum, you must look at the mergesort algorithm as a tree:
            sort(3,1,2,4)
            /           \
     sort(3,1)         sort(2,4)
     /      \          /      \
sort(3)  sort(1)   sort(2)  sort(4)
The sum iterates over each level of the tree: k=0 is the top, k=log(n) is the bottom. The tree will always be of height log2(n) (as it is a balanced binary tree).
To do a little math:
Σ 2^k * 2(n/2^k) =
2 * Σ 2^k * (n/2^k) =
2 * Σ n*2^k/2^k =
2 * Σ n =
2 * n * (1+log(n)) //As there are log(n)+1 steps from 0 to log(n) inclusive
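The closed form can be confirmed by evaluating the sum directly. A minimal sketch, assuming n is a power of two so that lg n is an integer; comparison_sum is a made-up name.

```python
def comparison_sum(n):
    """Evaluate sum_{k=0}^{lg n} 2^k * 2 * (n / 2^k) for n a power of two."""
    lg_n = n.bit_length() - 1          # exact lg n when n is a power of two
    return sum(2 ** k * 2 * (n // 2 ** k) for k in range(lg_n + 1))


for n in (1, 4, 8, 1024):
    lg_n = n.bit_length() - 1
    print(n, comparison_sum(n), 2 * n * (1 + lg_n))
```

Every term of the sum equals 2n, and there are lg n + 1 terms, so the two printed columns match.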
This is of course a lot of work to do, especially if you have more complex algorithms. In those situations you will be really happy to have the Master Theorem, but for the moment it might just get you more confused. It's very theoretical, so don't worry if you don't understand it right away.
For me, to understand issues like this, this is a good way to think about it.

Resources