How is it that a binary tree with n! leaves has height Ω(n log n)?

I came across the proposition that a binary tree with n! leaves has height Ω(n log n).
I am unable to understand how this is possible. I understand that the height h of a binary tree with n nodes satisfies log n <= h <= n, i.e. the height is at least log n (in the case of a complete binary tree), but I do not see a hint as to how the above proposition could be true or proved correct.
Any suggestions?

You have already stated that the lower bound on the height of a binary tree with n nodes is log n. It is a well-known fact (Stirling's formula) that log(n!) is approximately n log n. See for example here for a derivation.
A tree with n! leaves and minimal height has approximately 2·n! nodes. This gives log(2·n!) = log 2 + log(n!) ≈ log 2 + n log n, which is in Ω(n log n).
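A quick numeric sanity check of the log(n!) ≈ n log n step (a throwaway Python snippet; lgamma lets us get lg(n!) without ever computing n! itself):

    import math

    def lg_factorial(n):
        # lg(n!) = ln(n!) / ln 2, and math.lgamma(n + 1) returns ln(n!)
        return math.lgamma(n + 1) / math.log(2)

    for n in (10, 100, 1000, 10000):
        print(n, round(lg_factorial(n)), round(n * math.log2(n)))

The two columns agree up to the lower-order n·lg e term from Stirling's formula, so their ratio approaches 1 as n grows.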

Related

Dijkstra's: where does the equation m < n^2 / log n come from?

In this passage from my textbook, where do the inequalities come from? (The ones that I've marked with red rectangles.) I feel that they describe a relationship between vertices and edges in a graph, but I don't understand it.
You have two implementations of Dijkstra's algorithm to choose from. One runs in time O((m + n) log n) = O(m log n), assuming the graph is connected. The other runs in time O(n^2). The question is where the crossover point between these two runtimes lies. Equating and simplifying gives

m log n = n^2
m = n^2 / log n

So if m is asymptotically smaller than n^2 / log n, you'd prefer the heap implementation, and if m is asymptotically bigger than n^2 / log n, you'd prefer the unsorted sequence approach.
(Note that, with a Fibonacci heap, the runtime of Dijkstra's algorithm is O(m + n log n), which is never asymptotically worse than O(n^2).)
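As a rough illustration, here is a sketch of that decision rule in Python (the function and the example numbers are mine, not from the question):

    import math

    def prefer_heap(n, m):
        # Binary-heap Dijkstra costs about m log n; the unsorted-array
        # version costs about n^2. Prefer the heap when m log n < n^2,
        # i.e. when m < n^2 / log n.
        return m * math.log2(n) < n ** 2

    print(prefer_heap(1000, 5000))       # sparse graph: True, heap wins
    print(prefer_heap(1000, 1000 ** 2))  # dense graph: False, array wins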

Time complexity of generating a binary heap from an unsorted array

Can anyone explain why the time complexity of generating a binary heap from an unsorted array using bottom-up heap construction is O(n)?
(Solution found so far: I found in the Thomas and Goodrich book that the total sum of the sizes of the paths for the internal nodes while constructing the heap is 2n − 1, but I still don't understand their explanation.)
Thanks.
The normal BUILD-HEAP procedure for generating a binary heap from an unsorted array is implemented as below:

BUILD-HEAP(A)
    heap-size[A] ← length[A]
    for i ← ⌊length[A]/2⌋ downto 1
        do HEAPIFY(A, i)
Here the HEAPIFY procedure takes O(h) time, where h is the height of the tree, and there are O(n) such calls, making the running time O(n·h). Taking h = lg n, we can say that the BUILD-HEAP procedure takes O(n lg n) time.
For a tighter analysis, we can observe that the heights of most nodes are small.
In fact, at any height h there can be at most ⌈n / 2^(h+1)⌉ nodes, which we can easily prove by induction.
So the running time of BUILD-HEAP can be written as

    ∑ (h = 0 to lg n) ⌈n / 2^(h+1)⌉ · O(h) = O( n · ∑ (h = 0 to lg n) h / 2^h )

Now, for |x| < 1,

    ∑ (k = 0 to ∞) k·x^k = x / (1 − x)^2

Putting x = 1/2,

    ∑ (h = 0 to ∞) h / 2^h = (1/2) / (1 − 1/2)^2 = 2

Hence the running time becomes

    O( n · ∑ (h = 0 to lg n) h / 2^h ) ≤ O( n · ∑ (h = 0 to ∞) h / 2^h ) = O(2n) = O(n)

So this gives a running time of O(n).
N.B. The analysis is taken from this.
Check out Wikipedia on building a heap:
"A heap could be built by successive insertions. This approach requires O(n log n) time because each insertion takes O(log n) time and there are n elements. However, this is not the optimal method. The optimal method starts by arbitrarily putting the elements on a binary tree, respecting the shape property. Then, starting from the lowest level and moving upwards, sift the root of each subtree downward as in the deletion algorithm until the heap property is restored."
http://en.wikipedia.org/wiki/Binary_heap
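For concreteness, here is a minimal Python sketch of that bottom-up construction (a max-heap; the function names are mine, not from either source):

    def sift_down(a, i, n):
        # Sink a[i] until the subtree rooted at i satisfies the max-heap property.
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    def build_heap(a):
        # Heapify every internal node, deepest first: O(n) total work.
        n = len(a)
        for i in range(n // 2 - 1, -1, -1):
            sift_down(a, i, n)

    a = [3, 1, 4, 1, 5, 9, 2, 6]
    build_heap(a)
    print(a[0])  # 9, the maximum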

Give an asymptotic upper bound on the height of an n-node binary search tree in which the average depth of a node is Θ(lg n)

Recently I've been trying to solve all the exercises in CLRS, but there are some of them I can't figure out. Here is one of them, from CLRS exercise 12.4-2:
Describe a binary search tree on n nodes such that the average depth of a node in the tree is Θ(lg n) but the height of the tree is ω(lg n). Give an asymptotic upper bound on the height of an n-node binary search tree in which the average depth of a node is Θ(lg n).
Can anyone share some ideas or references to solve this problem? Thanks.
So let's suppose that we build the tree this way: given n nodes, take f(n) nodes and set them aside. Then build a tree whose root has a left subtree that is a perfect binary tree of n − f(n) − 1 nodes and a right subtree that is a chain of length f(n). We'll pick f(n) later.
So what's the average depth in the tree? Since we just want an asymptotic bound, let's pick n such that n − f(n) − 1 is one less than a power of two, say 2^k − 1. In that case, the sum of the depths in this part of the tree is 1·2 + 2·3 + 4·4 + 8·5 + ... + 2^(k−1)·k, which is (IIRC) about k·2^k, which by our choice of k is just about (n − f(n)) log(n − f(n)). In the other part of the tree, the total depth is about f(n)^2. This means that the average path length is about ((n − f(n)) log(n − f(n)) + f(n)^2) / n. Also, the height of the tree is f(n). So we want to maximize f(n) while keeping the average depth O(log n).
To do this, we need to find f(n) such that
1. n − f(n) = Θ(n), or else the log term in the numerator disappears and the average depth isn't logarithmic, and
2. f(n)^2 / n = O(log n), or else the second term in the numerator gets too big.
If you pick f(n) = Θ(sqrt(n log n)), I think that 1 and 2 are satisfied maximally. So I'd wager (though I could be totally wrong about this) that this is as good as you can get. You get a tree of height Θ(sqrt(n log n)) that has average depth Θ(log n).
Hope this helps! If my math is way off, please let me know. It's late now and I haven't done my usual double-checking. :-)
First, maximize the height of the tree (have a tree where each node has only one child, so you have a long chain going downward).
Check the average depth (obviously the average depth will be too high).
While the average depth is too high, you must decrease the height of the tree by one.
There are many ways to decrease the height of the tree by one; choose the one that minimizes the average depth (prove by induction that at each step you should select the one that minimizes the average depth). Keep going until you fall under the average-depth requirement (e.g., derive by induction a formula for the height and for the average depth).
If you are trying to maximize the height of a tree while minimizing the average depth of all its nodes, the unambiguously best shape is an "umbrella": a full binary tree with k nodes and height lg k, where 0 < k < n, together with a single path, or "tail", of n − k nodes coming out of one of the leaves of the full part. The height of this tree is roughly lg k + n − k.
Now let's compute the total depth of all the nodes. The sum of the depths of the nodes of the full part is ∑ (j = 0 to lg k) j·2^j, whose dominant term, by some algebra, is 2k lg k.
Next, the sum of the depths of the tail part is ∑ (i = 0 to n−k) (i + lg k), which by some algebra is approximately (n − k) lg k + (1/2)(n − k)^2.
Hence, summing the two parts together and dividing by n, the average depth of all the nodes is (1 + k/n) lg k + (n − k)^2 / (2n). Note that because 0 < k < n, the first term here is O(lg n) no matter what k we choose. Hence, we need only make sure the second term is O(lg n). To do so, we require that (n − k)^2 = O(n lg n), i.e. k = n − O(sqrt(n lg n)). With this choice, the height of the tree is
lg k + n − k = O(sqrt(n lg n)).
This is asymptotically larger than the ordinary O(lg n), and it is asymptotically the tallest you can make the tree while keeping the average depth O(lg n).
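A quick numeric check of the umbrella shape (a rough Python sketch; I use a complete rather than strictly full top part so any k works, and all the names here are mine):

    import math

    def umbrella_stats(n):
        t = int(math.sqrt(n * math.log2(n)))  # tail length ~ sqrt(n lg n)
        k = n - t                             # nodes in the complete top part
        # In a complete binary tree, node i (1-indexed) sits at depth floor(lg i).
        top = sum(int(math.log2(i)) for i in range(1, k + 1))
        d = int(math.log2(k))                 # depth of the top's deepest level
        tail = sum(d + i for i in range(1, t + 1))
        return (top + tail) / n, d + t        # (average depth, height)

    for n in (2**10, 2**14, 2**18):
        avg, height = umbrella_stats(n)
        print(n, round(avg, 1), round(math.log2(n), 1), height)

The average depth stays within a small constant factor of lg n while the height grows like sqrt(n lg n), matching the analysis above.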

Lower bounds on comparison sorts for a small fraction of inputs?

Can someone please walk me through the mathematical part of the solution to the following problem?
Show that there is no comparison sort whose running time is linear for at least half of the n! inputs of length n. What about a fraction of 1/n of the inputs of length n? What about a fraction of 1/2^n?
Solution:
If the sort runs in linear time for m input permutations, then the height h of the portion of the decision tree consisting of the m corresponding leaves and their ancestors is linear.
Use the same argument as in the proof of Theorem 8.1 to show that this is impossible for m = n!/2, n!/n, or n!/2^n.
We have 2^h ≥ m, which gives us h ≥ lg m. For all the possible values of m given here, lg m = Ω(n lg n), hence h = Ω(n lg n).
In particular,

    lg(n!/2)   = lg(n!) − 1    ≥ n lg n − n lg e − 1
    lg(n!/n)   = lg(n!) − lg n ≥ n lg n − n lg e − lg n
    lg(n!/2^n) = lg(n!) − n    ≥ n lg n − n lg e − n
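Numerically, all three quantities grow like n lg n, up to the lower-order n·lg e term (a throwaway Python check, reusing the lgamma trick from the first question above to get lg(n!) without computing n!):

    import math

    def lg_factorial(n):
        # lg(n!) = ln(n!) / ln 2, with math.lgamma(n + 1) = ln(n!)
        return math.lgamma(n + 1) / math.log(2)

    for n in (16, 256, 4096):
        print(n,
              round(lg_factorial(n) - 1),             # lg(n!/2)
              round(lg_factorial(n) - math.log2(n)),  # lg(n!/n)
              round(lg_factorial(n) - n),             # lg(n!/2^n)
              round(n * math.log2(n)))                # n lg n, for comparison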
Each of these proofs is a straightforward modification of the more general proof that you can't have a comparison sort that sorts any faster than Ω(n log n) (you can see this proof in this earlier answer). Intuitively, the argument goes as follows. In order for a sorting algorithm to work correctly, it has to be able to determine the initial ordering of the elements; otherwise, it can't reorder the values to put them in ascending order. Given n elements, there are n! different permutations of those elements, meaning that there are n! different inputs to the sorting algorithm.
Initially, the algorithm knows nothing about the input sequence, and it can't distinguish between any of the n! different permutations. Every time the algorithm makes a comparison, it gains a bit more information about how the elements are ordered. Specifically, it can tell whether the input permutation is in the group of permutations where the comparison yields true or in the group of permutations where the comparison yields false. You can visualize how the algorithm works as a binary tree, where each node corresponds to some state of the algorithm, and the (up to) two children of a particular node indicate the states of the algorithm that would be entered if the comparison yields true or false.
In order for the sorting algorithm to sort correctly, it has to be able to enter a unique state for each possible input, since otherwise it couldn't distinguish between two different input sequences and would therefore sort at least one of them incorrectly. This means that if you consider the leaf nodes of the tree (where the algorithm has finished comparing and is about to sort), there must be at least one leaf node per input permutation. In the general proof there are n! permutations, so there must be at least n! leaf nodes. A binary tree with k leaf nodes must have height at least lg k, meaning that you have to do Ω(log k) comparisons. Thus the general sorting lower bound is Ω(log n!) = Ω(n log n) by Stirling's approximation.
In the cases that you're considering, we're restricting ourselves to a subset of those possible permutations. For example, suppose that we want to be able to sort n!/2 of the permutations. This means that our tree must have height at least lg(n!/2) = lg(n!) − 1 = Ω(n log n). As a result, you can't sort in time O(n), because no linear function grows at the rate Ω(n log n). For the second part, seeing if you can get n!/n of the inputs sorted in linear time, the decision tree would again have to have height lg(n!/n) = lg(n!) − lg n = Ω(n log n), so you can't sort in O(n) comparisons. For the final one, we have lg(n!/2^n) = lg(n!) − n = Ω(n log n) as well, so again it can't be sorted in O(n) time.
However, you can sort 2^n of the permutations in linear time, since lg(2^n) = n = O(n).
Hope this helps!

recursion tree and binary tree cost calculation

I've got the following recursion:
T(n) = T(n/3) + T(2n/3) + O(n)
The height of the tree would be log_(3/2) n, since the longest root-to-leaf path keeps taking the 2n/3 branch. Now, the recursion tree for this recurrence is not a complete binary tree: it has missing nodes lower down. This makes sense to me; however, I don't understand how the following little-omega statement relates to the cost of all the leaves in the tree.
"... the total cost of all leaves would then be Θ(n^(log_(3/2) 2)) which, since log_(3/2) 2 is a constant strictly greater than 1, is ω(n lg n)."
Can someone please help me understand how Θ(n^(log_(3/2) 2)) becomes ω(n lg n)?
OK, to answer your explicit question about why n^(log_1.5(2)) is ω(n lg n):
For every constant k > 1, n^k grows faster than n lg n (powers eventually outgrow logs). Since 2 > 1.5, we have log_1.5(2) > 1 (numerically, log_1.5(2) = ln 2 / ln 1.5 ≈ 1.71), and thus n^(log_1.5(2)) grows faster than n lg n. And since our function is in Θ(n^(log_1.5(2))), it must also be in ω(n lg n).
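A quick Python check of both the exponent and the growth claim (purely illustrative):

    import math

    k = math.log(2, 1.5)  # log base 1.5 of 2
    print(round(k, 4))    # 1.7095, which is > 1

    for n in (2**10, 2**20, 2**30):
        # The ratio n^k / (n lg n) grows without bound, i.e. n^k = ω(n lg n).
        print(n, round(n**k / (n * math.log2(n)), 1))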
