Say we're given a max heap and we want to delete one of its leaf nodes. How much time does it take to delete a leaf node and maintain the max-heap property?
My main doubt is: will it take O(n) time to reach the leaf nodes?
Also, why does a binary heap have to be a complete binary tree and not an almost complete binary tree?
A binary heap is a complete binary tree: all levels are full, except possibly the last, which is filled from the left. A binary heap is not necessarily a full binary tree.
In a binary heap of size N, represented in an array, the leaf nodes are in the last half of the array. That is, with 0-based indexing, the nodes from index N/2 (integer division) to N-1 are leaf nodes. Deleting the last node (i.e. a[N-1]) is an O(1) operation: all you have to do is remove the node and decrease the size of the heap.
Removing any other leaf node is potentially an O(log n) operation because you have to:
1. Move the last node, a[N-1], to the position of the node that you're deleting.
2. Bubble that item up into the heap, to its proper position.
The first part is, of course, O(1). The second part can require up to log(n) - 1 moves. The average is less than 2, but the worst case is log(n) - 1.
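The two steps above can be sketched in Python as follows. This is only a minimal sketch, assuming a 0-based array (a Python list) that already satisfies the max-heap property; the name `delete_leaf` is made up for illustration.

```python
def delete_leaf(heap, i):
    """Remove the leaf at index i from an array-based max-heap, in place.

    Assumes `heap` is a valid max-heap stored 0-based in a list and that
    index i is a leaf (it has no children). Returns the removed value.
    """
    n = len(heap)
    assert 0 <= i < n and 2 * i + 1 >= n, "index i must be a leaf"
    removed = heap[i]
    heap[i] = heap[n - 1]  # step 1: move the last node into the hole, O(1)
    heap.pop()             # shrink the heap by one
    if i == n - 1:
        return removed     # we deleted the last node itself: nothing to fix
    # Step 2: bubble the moved item up while it is larger than its parent.
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent
    return removed


heap = [50, 10, 40, 5, 8, 30, 35]
delete_leaf(heap, 3)  # removes 5; 35 moves into its slot and bubbles up past 10
```

Only bubbling up is needed here because the slot being filled is a leaf, so the moved item has nothing below it to compare against. Deleting an internal node would also need a sift-down step.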
In a max heap you can reach a leaf node in O(log n), since the heap is a complete binary tree and traversing its entire height takes O(log n).
Once this is done, you can call heapify to restore the heap property, which takes O(log n).
An almost complete binary tree is no different from a complete binary tree, except that it has the following two restrictions:
Nodes go to the next level only after the current level is complete.
Within a level, the left position is filled before the right one.
Every formula that applies to a complete binary tree also applies to an almost complete binary tree.
The only difference is that an almost complete binary tree has a gap at the right end of its last level. If there is no gap, it is a complete binary tree.
A heap is forced to be a complete binary tree for efficiency purposes.
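To illustrate the efficiency point: when there are no gaps before the last node, the tree can be stored in a plain array and navigated with index arithmetic instead of pointers. A rough sketch, assuming 0-based indexing (the helper names are invented for this example):

```python
def parent(i):
    """Index of the parent of node i (for i > 0)."""
    return (i - 1) // 2


def children(i, n):
    """Indices of the children of node i that exist in a heap of size n."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < n]


def first_leaf(n):
    """Index of the first leaf; every index from here to n - 1 is a leaf."""
    return n // 2


heap = [50, 35, 40, 10, 8, 30]
n = len(heap)
leaves = heap[first_leaf(n):]  # the leaves are simply the tail of the array
```

These formulas only work because the array has no holes. With gaps in the tree you would need either explicit pointers or placeholder slots.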
One of my lecture slides says the following:
To find the middle element in an AVL tree, I traverse the elements in order until I reach the middle element. It takes O(N).
As far as I know, finding an element in a tree takes O(log n), base 2, since an AVL tree is a binary tree that always splits into 2 children.
So why does it say O(N)?
I am just trying to elaborate on A. Mashreghi's comment.
Since the tree under consideration is an AVL tree, finding an element is guaranteed to take O(log n), as long as you have the element (key) to find.
The problem is that you are trying to identify the middle element of the given data structure. As it is an AVL tree (a self-balancing BST), an in-order traversal gives you the elements in ascending order. You want to use this property to find the middle element.
The algorithm goes like this: increment a counter for every node visited in order and return the node at position n/2. This visits about n/2 nodes, which is O(n/2), and hence the overall complexity is O(n).
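A minimal sketch of that counting traversal, assuming a bare `Node` class with `key`, `left` and `right` fields and that the node count `n` is already known (both are assumptions for illustration, not something the slides specify):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(node):
    """Yield keys in ascending order using an explicit stack."""
    stack = []
    while stack or node:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.key
        node = node.right


def middle(root, n):
    """Return the key at in-order position n // 2 (0-based).

    Stops as soon as the middle is reached, so it visits about n / 2 nodes,
    which is still O(n).
    """
    for position, key in enumerate(inorder(root)):
        if position == n // 2:
            return key
    raise ValueError("tree has fewer nodes than expected")


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(middle(root, 7))  # the fourth smallest of seven keys
```

Balancing does not help here: the search is by rank, not by key, and a plain AVL tree stores no subtree sizes that would let you skip whole subtrees.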
Being divided into 2 children does not guarantee perfect symmetry. For instance, consider the most unbalanced of all balanced binary trees: each right child has a depth one more than its corresponding left child.
In such a tree, the middle element will be somewhere down in the right branch's left branch's ...
You need to determine how many nodes N you have, then locate the N/2th largest node. This is not an O(log N) process.
I am having trouble with these questions:
A binary tree with N nodes is at least how deep?
How deep is it at most?
Would the maximum depth just be N?
There are two extremes that you need to consider.
Every node has just a left (or just a right) child, but never both. In that case your tree is in practice merely a linked list.
Every level in your tree is full, except maybe the last level. Trees of this type are called complete.
A third type of tree that I know may not be relevant to your question: it is called a full tree, and every node is either a leaf or has n children for an n-ary tree.
So, to answer your question: the max depth is N, and the tree has at least about log(N) levels (precisely floor(log2(N)) + 1), which happens when it is a complete tree.
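As a quick sanity check of the two extremes, here is a small sketch that counts depth in levels (a single root is 1 level), assuming N >= 1. The function names are made up for this example.

```python
def min_levels(n):
    """Fewest levels a binary tree with n >= 1 nodes can have.

    A tree with h levels holds at most 2**h - 1 nodes, so we need the
    smallest h with 2**h - 1 >= n, which is floor(log2(n)) + 1.
    """
    return n.bit_length()


def max_levels(n):
    """Most levels: every node has a single child, i.e. a linked list."""
    return n


for n in (1, 3, 4, 7, 8, 1_000_000):
    print(n, min_levels(n), max_levels(n))
```

If you count depth in edges rather than levels, subtract one from both values.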
Given a very large binary tree (i.e. with millions of nodes), how to handle determining the number of nodes in the tree? In other words, given the root node of this tree to a function, the function should return the number of nodes in the tree.
Or, say, how do you check whether the binary tree is a BST if it has a very large number of nodes?
Walk all the nodes and check whatever conditions or metrics you need. There is nothing else you can do without additional knowledge about the tree.
You can enforce particular conditions when the tree is created (e.g. it must be balanced, sorted, or whatever) or collect information about the tree at creation time (e.g. store and constantly update the number of children).
To check whether it's a valid BST you have to visit every node depth-first, in order, and ensure each node is larger than the previous one.
If you want to estimate how long that will take for a balanced BST, you could get a quick approximation of the size by counting the length of one leg. If that leg is n levels long, I believe the total size will be between 2^(n-1) and 2^n - 1 inclusive (that range is exact for a complete tree).
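For a tree with millions of nodes, a recursive walk can hit the recursion limit if the tree is deep, so here is a sketch that does both jobs (counting and BST checking) in one iterative in-order pass. It assumes a simple `Node` class with `key`, `left` and `right`, and it treats duplicate keys as invalid; both choices are assumptions for the example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def count_and_check_bst(root):
    """Return (node_count, is_bst) from a single iterative in-order walk.

    The explicit stack holds at most one root-to-leaf path, so extra memory
    is O(height). Assumes distinct keys: equal neighbours count as invalid.
    """
    count = 0
    is_bst = True
    previous = None
    stack, node = [], root
    while stack or node:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        count += 1
        # In a valid BST, in-order keys are strictly increasing.
        if previous is not None and node.key <= previous:
            is_bst = False
        previous = node.key
        node = node.right
    return count, is_bst


good = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
bad = Node(4, Node(2, Node(1), Node(5)), Node(6))  # 5 sits in the left subtree of 4
print(count_and_check_bst(good), count_and_check_bst(bad))
```

Either way every node is visited once, so the running time is O(n) no matter how the tree is shaped.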
To delete a node from a binary tree, we first have to search for the node. That takes at least O(log N) and at most O(N). Depending on the node, we then have to rearrange the pointers. How do we calculate the time complexity of that?
That depends on how you're doing the deletion. The most common way involves finding the successor of the node, then replacing the node with that successor. This can be done in O(h), where h is the height of the tree. In the worst case this is O(n), but in a balanced tree it is worst-case O(lg n).
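Here is one way the successor-based deletion could look. It is a sketch only, assuming a plain unbalanced BST with distinct keys and a hypothetical `Node` class. It copies the successor's key into the node rather than relinking the successor node itself, which is one of several common variants.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def delete(root, key):
    """Delete `key` from the BST rooted at `root` and return the new root.

    Both loops only walk downwards, so the total work is O(h).
    """
    parent, node = None, root
    while node and node.key != key:  # search, remembering the parent
        parent, node = node, (node.left if key < node.key else node.right)
    if node is None:
        return root                  # key not present: nothing to do
    if node.left and node.right:
        # Two children: find the in-order successor (leftmost node of the
        # right subtree), copy its key up, then unlink the successor instead.
        succ_parent, succ = node, node.right
        while succ.left:
            succ_parent, succ = succ, succ.left
        node.key = succ.key
        parent, node = succ_parent, succ
    child = node.left or node.right  # node has at most one child here
    if parent is None:
        return child                 # we removed the root itself
    if parent.left is node:
        parent.left = child
    else:
        parent.right = child
    return root


def keys(node):
    """In-order list of keys, handy for checking the result."""
    return keys(node.left) + [node.key] + keys(node.right) if node else []


root = Node(5, Node(3, Node(2), Node(4)), Node(8, Node(7), Node(9)))
root = delete(root, 5)
print(keys(root))
```

The pointer rearrangement after the two downward walks is a constant number of assignments, so the O(h) bound comes entirely from the searching.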
Yes, the best-case complexity is O(log n) (when the tree is perfectly balanced) and the worst-case complexity is O(n), for example for a skewed tree such as 1 - 2 - 3 - 4.
But the main problem with BST deletion (Hibbard deletion) is that it is not symmetric. After many insertions and deletions the BST becomes less balanced. Researchers showed that after a sufficiently long sequence of random inserts and deletes the height of the tree becomes proportional to sqrt(n), so every operation (search, insert, delete) then takes sqrt(n) time, which is poor compared to O(log n).
Finding an efficient symmetric delete for BSTs is a very long-standing open problem (around 50 years). For a guaranteed balanced tree, we have to use a red-black tree or similar.
Where are you getting the "worst search time as max O(N)"? That should never happen in a balanced BST. At worst, search and delete take O(h), where 'h' is the height of the tree; h only reaches N when the tree degenerates into a chain. See this helpful article.
Most of the complexity is searching for the node. Once it has been found—as long as the parent node is retained—it is only a few more assignments to delete the node. So it is a constant order.
In the case of deletion, two factors affect the cost of deleting a node:
1. Locating the parent node of the target node (traversing down to the parent of the target node):
It may happen that a pointer to the target node is given but a pointer to its parent is not. Keep in mind that it is the parent node that, after the deletion, has to point to the in-order successor or predecessor of the target node. Hence, if the pointer to the parent is not given, you have to traverse the tree until you find it.
The best case for finding the parent of the target node is O(log n), for a balanced tree.
The worst case for finding the parent of the target node is O(n), for a skewed tree.
2. The adjustment or rearrangement of the tree after deletion of the target node:
The cost of the adjustment is the cost of finding the in-order predecessor or successor of the target node. Once it has been found, we only have to point the parent of the target node to it, which takes a constant amount of time.
In the best case the cost is O(1), if both children of the target node are leaf nodes, since in that case:
Left node -> in-order predecessor
Right node -> in-order successor
Note:
If even one child is not a leaf node, we cannot consider it the best case, since whether we search for the in-order successor or the predecessor is implementation dependent.
In the worst case, both children of the target node are skewed trees, hence O(n).
Now if you see,
For factor 1, best case : O(log n)
For factor 2, best case : O(1)
Overall : O(log n) + O(1) = O(log n)
For factor 1, worst case : O(n)
For factor 2, worst case : O(n)
Overall : O(n) + O(n) = O(n)
I want to sum all the values in the leaves of a BST. Apparently, I can't get to the leaves without traversing the whole tree. Is this true? Can I get to the leaves without taking O(N) time?
You realize that in a balanced tree the leaves alone are about half of the n nodes, so visiting them is already O(n) anyway?
There is no way to get the leaves of a tree without traversing the whole tree (especially if you want every single leaf), which will unfortunately operate in O(n) time. Are you sure that a tree is the best way to store your data if you want to access all of these leaves? There are other data structures which will allow more efficient access to your data.
To access all the leaf nodes of a BST, you have to traverse all the nodes of the BST, and that is O(n).
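For reference, the full traversal is short. A sketch, assuming nodes with `key`, `left` and `right` fields (a hypothetical `Node` class), using an explicit stack so that deep trees don't hit the recursion limit:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def sum_leaves(root):
    """Sum the keys of all leaf nodes; every node is visited once, so O(n)."""
    total = 0
    stack = [root] if root else []
    while stack:
        node = stack.pop()
        if node.left is None and node.right is None:
            total += node.key  # no children: this is a leaf
        else:
            if node.left:
                stack.append(node.left)
            if node.right:
                stack.append(node.right)
    return total


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(sum_leaves(root))  # leaves are 1, 3, 5 and 7
```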
One alternative is to use B+ tree where you can traverse to a leaf node in O(log n) time and after that all leaf nodes can be accessed sequentially to compute the sum. So, in your case it would be O(log n + k), where k is the number of leaf nodes and n is the total number of nodes in the B+ tree.
cheers
You will either have to traverse the tree searching for nodes without children, or modify the structure you are using to represent the tree to include a list of the leaf nodes. This will also necessitate modifying your insert and delete methods to maintain the list (for instance, if you remove the last child from a node, it becomes a leaf node). Unless the tree is very large, it's probably good enough to just go ahead and traverse the tree.