Why O(N log N) to build a binary search tree? - binary-tree

Getting ready for an exam; this is not a homework question.
I figured that the worst case to build a BST is O(N^2): each insert requires up to N-1 comparisons, and summing all the comparisons gives 0 + 1 + ... + (N-1) ~ N^2. This is the case for a skewed BST.
Insertion into a balanced BST is O(log N), so why is the best case O(N log N) to construct the tree?
My best guess: since a single insertion is O(log N), summing all the insertions somehow gives us N log N.
Thanks!

As you wrote :) A single insertion is O(log N): because the height of a balanced tree with N elements is log N, you need up to log N comparisons to insert a single element. You need to do N of these insertions, so N * log N.
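To make the two cases concrete, here is a small Python sketch (my own, not from the posts) that counts comparisons while inserting into a plain, unbalanced BST: sorted input produces the skewed ~N^2/2 worst case, while shuffled input stays near N log N on average.

```python
import random

def insert(root, key, counter):
    """Iteratively insert key into a plain (unbalanced) BST, counting comparisons."""
    node = {"key": key, "left": None, "right": None}
    if root is None:
        return node
    cur = root
    while True:
        counter[0] += 1
        side = "left" if key < cur["key"] else "right"
        if cur[side] is None:
            cur[side] = node
            return root
        cur = cur[side]

def build_cost(keys):
    """Total comparisons needed to insert keys one by one into an empty BST."""
    root, counter = None, [0]
    for k in keys:
        root = insert(root, k, counter)
    return counter[0]

random.seed(0)
n = 1024
skewed = build_cost(range(n))                          # sorted input -> skewed tree
balanced_ish = build_cost(random.sample(range(n), n))  # random input -> bushy tree
print(skewed)        # 523776 == n*(n-1)/2, the O(N^2) case
print(balanced_ish)  # far below the quadratic cost, near N log N
```

The gap grows with N: the sorted-input build pays the full arithmetic series, while the shuffled build pays roughly the height of a bushy tree per insert.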

Related

What is the complexity of binary search tree to sort n elements

We are given a BST with n nodes whose keys are integers. What would be the complexity of printing all the integers in sorted order?
Can anyone help?
To print in sorted order you would have to do an in-order traversal of the tree, which is a depth-first traversal.
The time complexity of that would be O(n + m) where n is the number of nodes and m is the number of edges.
Since this is a BST, maximum number of edges would be n - 1 hence the time complexity would be O(n + n - 1) = O(n).
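A minimal illustration (my own sketch, not from the answer): an in-order traversal of a hand-built BST emits the keys in sorted order, touching each node exactly once.

```python
def inorder(node, out):
    """In-order traversal: left subtree, then node, then right subtree."""
    if node is None:
        return
    inorder(node["left"], out)
    out.append(node["key"])
    inorder(node["right"], out)

# a small hand-built BST holding {1, 3, 4, 6, 7}
tree = {"key": 4,
        "left":  {"key": 3,
                  "left": {"key": 1, "left": None, "right": None},
                  "right": None},
        "right": {"key": 6,
                  "left": None,
                  "right": {"key": 7, "left": None, "right": None}}}

result = []
inorder(tree, result)
print(result)  # [1, 3, 4, 6, 7]
```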

finding the complexity of insertion to avl tree

If I have an empty AVL tree and I want to insert a set of ordered numbers (1, 2, ..., k), why is the complexity O(k)?
Thank you.
It's more of a math question, so here is the deal.
An AVL tree with n nodes has O(log n) time complexity for inserting an element.
So for the set of numbers (1, 2, 3, ..., k) you want to insert, the total time is
the sum from i=1 to i=k of log(i) (i.e. log 1 + log 2 + log 3 + ... + log k)
which equals
log(k!)
which is approximately
k * log(k) (by Stirling's approximation).
So to answer your question, it should be O(k log k) instead of O(k).
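The log(k!) ≈ k log k step can be checked numerically. This small sketch (my addition, not part of the answer) compares the exact sum of logs against the k log k bound:

```python
import math

# Exact log2(k!) (the summed insertion costs) vs the k*log2(k) upper bound.
for k in (10, 100, 1000):
    log_fact = sum(math.log2(i) for i in range(1, k + 1))  # log2(k!)
    bound = k * math.log2(k)
    print(k, round(log_fact, 1), round(bound, 1), round(log_fact / bound, 3))
```

The ratio in the last column climbs toward 1 as k grows, which is exactly the statement log(k!) = Θ(k log k).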

Specific algorithm sorting n elements with m distinct values

I am going through exercises for an exam in algorithm analysis and this is one of them:
Present an algorithm that takes as input a list of n elements (that are comparable) and sorts them in O(n log m) time, where m is the number of distinct values in the input list.
I have read about the common sorting algorithms and I really can't come up with a solution.
Thanks for your help
You can build an augmented balanced binary search tree on the n elements. The augmented info stored at each node is its frequency. You build this structure with n insertions into the tree; since the tree never holds more than m nodes, the time for this is O(n lg m).
Then you do an in-order traversal of this tree: visit the left subtree, print the element stored at the root f times, where f is its frequency (this was the augmented info), and finally visit the right subtree. This traversal takes O(n + m) time.
So the running time of this simple procedure is O(n lg m + n + m) = O(n lg m), since m <= n.
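Python's standard library has no balanced BST, so here is a hedged variant of the same idea (function name and structure are mine, not the answer's): a hash-based Counter records each distinct value's frequency in O(n) expected time, and only the m distinct keys are sorted. The resulting O(n + m log m) is within the required O(n log m) bound since m <= n.

```python
from collections import Counter

def sort_with_few_distinct(xs):
    """Sort a list of n items with m distinct values in O(n + m log m) time.

    Counter stands in for the answer's augmented balanced BST: it records
    each distinct value's frequency; we then sort only the m distinct keys
    and expand each key according to its count.
    """
    freq = Counter(xs)                 # O(n) expected
    out = []
    for key in sorted(freq):           # O(m log m)
        out.extend([key] * freq[key])  # O(n) total across all keys
    return out

data = [5, 1, 5, 2, 1, 5, 2, 2, 1]
print(sort_with_few_distinct(data))  # [1, 1, 1, 2, 2, 2, 5, 5, 5]
```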

Time complexity for generating binary heap from unsorted array

Can anyone explain why the time complexity for generating a binary heap from an unsorted array using bottom-up heap construction is O(n)?
(Solution found so far: I found in the Thomas and Goodrich book that the total sum of the sizes of the paths for internal nodes while constructing the heap is 2n - 1, but I still don't understand their explanation.)
Thanks.
The normal BUILD-HEAP procedure for generating a binary heap from an unsorted array is implemented as below:

BUILD-HEAP(A)
    heap-size[A] ← length[A]
    for i ← length[A]/2 downto 1
        do HEAPIFY(A, i)

Here the HEAPIFY procedure takes O(h) time, where h is the height of the subtree it is called on, and there are O(n) such calls, making the running time O(n h). Taking h = lg n, we can say that the BUILD-HEAP procedure takes O(n lg n) time.
For a tighter analysis, observe that the heights of most nodes are small.
In fact, at any height h there can be at most ⌈n/2^(h+1)⌉ nodes, which we can easily prove by induction.
So the running time of BUILD-HEAP can be written as

    Σ_{h=0}^{lg n} ⌈n/2^(h+1)⌉ · O(h) = O(n · Σ_{h=0}^{lg n} h/2^h)

Now, using the identity

    Σ_{k=0}^{∞} k·x^k = x/(1-x)^2

and putting x = 1/2,

    Σ_{h=0}^{∞} h/2^h = (1/2) / (1 - 1/2)^2 = 2

Hence the running time becomes

    O(n · Σ_{h=0}^{lg n} h/2^h) ≤ O(n · Σ_{h=0}^{∞} h/2^h) = O(n · 2) = O(n)

So this gives a running time of O(n).
N.B. The analysis is taken from this.
Check out Wikipedia:
Building a heap:
A heap could be built by successive insertions. This approach requires O(n log n) time because each insertion takes O(log n) time and there are n elements. However, this is not the optimal method. The optimal method starts by arbitrarily putting the elements on a binary tree, respecting the shape property. Then, starting from the lowest level and moving upwards, sift the root of each subtree downward as in the deletion algorithm until the heap property is restored.
http://en.wikipedia.org/wiki/Binary_heap
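A minimal Python sketch of the bottom-up method the excerpt describes (the function names are mine): every internal node is sifted down, deepest levels first, and most of those sifts are short, which is exactly where the O(n) bound comes from.

```python
def sift_down(a, i, n):
    """Restore the max-heap property for the subtree rooted at index i."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_heap(a):
    """Bottom-up BUILD-HEAP: sift down each internal node, deepest first."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # internal nodes are indices 0 .. n//2 - 1
        sift_down(a, i, n)

a = [3, 9, 2, 1, 4, 5, 10, 6]
build_heap(a)
print(a[0])  # 10 -- the maximum sits at the root
# every parent dominates both of its children
assert all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))
```

Note that only n/2 nodes are ever sifted, and half of those sit just one level above the leaves, which is why the total work telescopes to O(n) rather than O(n log n).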

Balanced Search Tree Query, Asymptotic Analysis

The situation is as follows:
We have n numbers and we have to print them in sorted order. We have access to a balanced dictionary data structure, which supports the operations search, insert, delete, minimum, and maximum, each in O(log n) time.
We want to retrieve the numbers in sorted order in O(n log n) time using only insert and in-order traversal.
The answer to this is:

Sort()
    initialize(t)
    while (not EOF)
        read(x)
        insert(x, t)
    Traverse(t)
Now the query is: if we read the elements in time n and then traverse the elements in log n time (in-order traversal), then the total time for this algorithm is (n + log n), according to me. Please explain the time calculation for this algorithm. How does it sort the list in O(n log n) time?
Thanks.
Each insert is O(log n). You are doing n inserts, so that gives n * O(log n) = O(n log n) asymptotic time complexity. Traversing the tree is O(n), because there are n nodes. That adds up to O(n + n log n), and since n is a lower-order term than n log n, the final asymptotic complexity is O(n log n).
and then traverse the elements in "log n
Traversal is O(n), not O(log n). An insertion is O(log n), and you're doing n such insertions.
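The Sort() pseudocode above can be sketched in Python roughly as follows (a tree sort on a plain BST; the class and function names are my own, and a self-balancing tree would be needed to guarantee the O(log n) per-insert bound on adversarial input):

```python
import random

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """O(log n) per call on a balanced/random tree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(node, out):
    """O(n): visits every node exactly once."""
    if node:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)

def tree_sort(xs):
    root = None
    for x in xs:        # n inserts -> O(n log n) total
        root = insert(root, x)
    out = []
    inorder(root, out)  # one O(n) traversal
    return out

data = random.sample(range(100), 20)
print(tree_sort(data) == sorted(data))  # True
```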
