Min-Heap to Max-Heap: maximum number of comparisons

I want to find the maximum number of comparisons needed to convert a min-heap with n nodes into a max-heap. Can it be done in O(n) comparisons, or is there no way around re-creating the heap from scratch?

As a crude lower bound: given a tree with the (min- or max-) heap property, we have no prior information about how the values at the leaves compare to one another. In a max-heap, the values at the leaves may all be less than every value at the interior nodes. If the heap has the topology of a complete binary tree, then even finding the minimum requires roughly n/2 comparisons, where n is the number of tree nodes.

If you have a min-heap of known size, then you can create a binary max-heap of its elements by filling an array from back to front with the values obtained by iteratively deleting the root node from the min-heap until it is exhausted. Under some circumstances this can even be done in place. Using the rule that the root node is element 0 and the children of node i are elements 2i+1 and 2i+2, the (max-) heap condition will automatically be satisfied for the heap represented by the new array.
Each deletion from a min-heap of size m requires up to O(log m) element comparisons to restore the heap condition, however, which adds up to O(n log n) comparisons for the whole job. I am doubtful that you can do it with any lower complexity without adding conditions. In particular, if you do not perform genuine heap deletions (incurring the cost of restoring the heap condition), then I think you incur comparable additional costs to ensure that you end up with a heap in the end.
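A minimal sketch of the in-place variant in Python, assuming the heap is a 0-based array with the children of node i at 2i+1 and 2i+2 (the function names are mine, not from any particular library). It is essentially heapsort driven by the min-heap: each deleted minimum is parked in the slot that just fell off the heap's end, and the resulting descending array is automatically a max-heap:

```python
def min_heap_to_max_heap(a):
    # In-place version of the scheme above: repeatedly delete the root
    # of the min-heap and park it in the slot that just fell off the
    # heap's end. The result is the array sorted in descending order,
    # which automatically satisfies the max-heap condition.
    # Cost: n deletions of O(log n) comparisons each, O(n log n) total.

    def sift_down(i, size):
        # Restore the min-heap property for a[0:size], starting at i.
        while True:
            smallest, l, r = i, 2 * i + 1, 2 * i + 2
            if l < size and a[l] < a[smallest]:
                smallest = l
            if r < size and a[r] < a[smallest]:
                smallest = r
            if smallest == i:
                return
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest

    for size in range(len(a), 1, -1):
        a[0], a[size - 1] = a[size - 1], a[0]   # delete-min into slot size-1
        sift_down(0, size - 1)
    return a
```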

Related

Implementing priority queue using max heap vs balanced BST

A balanced BST and a max heap both perform insert and delete in O(log n). However, finding the max value in a max heap is O(1), while in a balanced BST it is O(log n).
Removing the max value from a max heap takes O(log n), because it is a delete operation.
In a balanced BST, deleting the max element = finding the max value + deleting it, which is O(log n) + O(log n) and reduces to O(log n). So even deleting the max value in a balanced BST is O(log n).
I have read that one application of a max heap is a priority queue, whose primary purpose is to remove the max value on every dequeue operation. If deleting the max element is O(log n) for both a max heap and a balanced BST, I have the following questions:
What is the purpose of a max heap in a priority queue? Is it used just because it is easier to implement than a fully searchable balanced BST?
Since there is no balance-factor calculation, can a max heap be called an unbalanced binary tree?
Every balanced BST can be used as a priority queue and is also searchable in O(log n), whereas searching a max heap is O(n). Correct?
All time complexities are for the worst case. Any help is greatly appreciated.
What is the purpose of a max heap in a priority queue? Is it used just because it is easier to implement than a fully searchable balanced BST?
Some advantages of a heap are:
Given an unsorted input array, a heap can still be built in O(n) time, while a BST needs O(n log n) time (a short sketch follows this list).
If the initial input is an array, that same array can serve as the heap, meaning no extra memory is needed for it. Although one could think of ways to build a BST in place inside the array, it would be quite odd (for primitive types) and add processing overhead. A BST is usually created from scratch, copying the data into the nodes as they are created.
Interesting fact: a sorted array is also a valid heap, so if the input is known to be sorted, nothing needs to be done to build the heap.
A heap can be stored as an array without storing any cross references, while a BST usually consists of nodes with left and right references. This has at least two consequences:
The memory used for a BST is about three times greater than for a heap.
Although several operations have the same time complexity for both heap and BST, the overhead of updating a BST is much greater, so the actual time spent on these operations is a (constant) factor greater in the BST case.
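A quick illustration of the first two advantages, using Python's heapq module (a min-heap): heapify rearranges an existing list in place in O(n), so the input array itself becomes the heap and no extra memory is needed:

```python
import heapq

data = [9, 4, 7, 1, 8, 2]
heapq.heapify(data)   # rearranges the list in place in O(n)
print(data[0])        # 1 -- the minimum now sits at index 0
```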
Since there is no balance-factor calculation, can a max heap be called an unbalanced binary tree?
A heap is in fact a complete binary tree, so it is always as balanced as it can be: the leaves will always be positioned in the last or one-but-last level. A self-balancing BST (AVL, red-black, ...) cannot beat that level of balancing; there you will often have leaves occurring at three or more different levels.
Every balanced BST can be used as a priority queue and is also searchable in O(log n), whereas searching a max heap is O(n). Correct?
Yes, this is true. So if the application needs the search feature, then a BST is superior.
What is the purpose of a max heap in a priority queue? Is it used just because it is easier to implement than a fully searchable balanced BST?
Nope. A max heap fits better, since it is purpose-built to return the next element (respecting priority) as quickly as possible, in O(1) time. That's what you want from the simplest possible priority queue.
Since there is no balance-factor calculation, can a max heap be called an unbalanced binary tree?
Nope. There is balance as well. Long story short, a heap is kept balanced by sift-up and sift-down operations (swapping elements that are out of order).
Every balanced BST can be used as a priority queue and is also searchable in O(log n), whereas searching a max heap is O(n). Correct?
Yeah! Just as a linked list or an array could be used. It is just going to be more expensive in big-O terms and much slower in practice.

Advantages of heaps over binary trees in the Dijkstra algorithm

One standard implementation of the Dijkstra algorithm uses a heap to store distances from the starting node S to all unexplored nodes. The argument for using a heap is that we can efficiently pop the minimum distance from it, in O(log n). However, to maintain the invariant of the algorithm, one also needs to update some of the distances in the heap. This involves:
popping non-min elements from the heap
computing the updated distances
inserting them back into the heap
I understand that popping non-min elements from a heap can be done in O(log n) if one knows the location of that element in the heap. However, I fail to understand how one can know this location in the case of the Dijkstra algorithm. It sounds like a binary search tree would be more appropriate.
More generally, my understanding is that the only thing that a heap can do better than a balanced binary search tree is to access (without removing) the min element. Is my understanding correct?
However, I fail to understand how one can know this location in the case of the Dijkstra algorithm.
You need an additional array that keeps track of where in the heap the elements live, or an extra data member inside the heap's elements. This has to be updated after each heap operation.
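A minimal sketch of that bookkeeping, assuming vertices are integers 0..n-1 as in a typical Dijkstra implementation (the class and method names are illustrative, not a standard API). The pos array is the extra structure: it maps each vertex to its current slot in the heap array, and every swap keeps it in sync, which is what makes an O(log n) decrease-key possible:

```python
class IndexedMinHeap:
    # A binary min-heap of vertices keyed by distance, plus a position
    # map pos[v] = index of vertex v inside the heap array. Every swap
    # updates the map, which is what makes decrease_key O(log n).
    def __init__(self, n):
        self.key = {}           # vertex -> current distance
        self.heap = []          # heap array of vertices
        self.pos = [-1] * n     # vertex -> slot in self.heap (-1 = absent)

    def _swap(self, i, j):
        h, p = self.heap, self.pos
        h[i], h[j] = h[j], h[i]
        p[h[i]], p[h[j]] = i, j          # keep the position map in sync

    def _less(self, i, j):
        return self.key[self.heap[i]] < self.key[self.heap[j]]

    def _sift_up(self, i):
        while i > 0 and self._less(i, (i - 1) // 2):
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            best, l, r = i, 2 * i + 1, 2 * i + 2
            if l < n and self._less(l, best):
                best = l
            if r < n and self._less(r, best):
                best = r
            if best == i:
                return
            self._swap(i, best)
            i = best

    def push(self, v, d):
        self.key[v] = d
        self.heap.append(v)
        self.pos[v] = len(self.heap) - 1
        self._sift_up(self.pos[v])

    def pop_min(self):
        v = self.heap[0]
        self._swap(0, len(self.heap) - 1)
        self.heap.pop()
        self.pos[v] = -1
        if self.heap:
            self._sift_down(0)
        return v, self.key[v]

    def decrease_key(self, v, d):   # O(log n), thanks to pos[v]
        self.key[v] = d
        self._sift_up(self.pos[v])
```

Dijkstra's algorithm then calls pop_min to settle a vertex and decrease_key whenever an edge relaxation improves a distance.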
the only thing that a heap can do better than a balanced binary search tree is to access (without removing) the min element
Even a BST can be amended to keep a pointer to the min element in addition to the root pointer, giving O(1) access to the min (effectively amortizing the O(lg n) work over the other operations).
The only advantage of heaps in terms of worst-case complexity is the "heapify" algorithm, which turns an array into a heap by reshuffling its elements in-place, in linear time. For Dijkstra's, this doesn't matter, since it's going to do n heap operations of O(lg n) cost apiece anyway.
The real reason for heaps, then, is constants. A properly implemented heap is just a contiguous array of elements, while a BST is a pointer structure. Even when a BST is implemented inside an array (which can be done if the number of elements is known from the start, as in Dijkstra's), the pointers take up more memory, and navigating them takes more time than the integer operations that are used to navigate a heap.

Why siftDown is better than siftUp in heapify?

To build a max-heap we can use either siftDown or siftUp. Sifting down, we start from the root and compare it to its two children; we swap it with the larger of the two children, stopping if both children are smaller, and otherwise continue sifting the element down until it reaches a leaf node (or, again, until it is larger than both of its children).
We only need to do that n/2 times, because the number of leaves is n/2, and the leaves already satisfy the heap property once we finish heapifying the last element on the level just above them, so we are left with n/2 elements to heapify.
If we use siftUp instead, we start with the leaves, and eventually we need to heapify all n elements.
My question is: when we use siftDown, aren't we doing two comparisons per step (comparing the element to both of its children) instead of the single comparison per step in siftUp (comparing the element to its parent)? If so, wouldn't that double the cost, leaving us with exactly the same complexity as sifting up?
Actually, building a heap with repeated calls of siftDown has a complexity of O(n), whereas building it with repeated calls of siftUp has a complexity of O(n log n).
This is due to the fact that when you use siftDown, the time taken by each call decreases with the depth of the node because these nodes are closer to the leaves. When you use siftUp, the number of swaps increases with the depth of the node because if you are at full depth, you may have to swap all the way to the root. As the number of nodes grows exponentially with the depth of the tree, using siftUp gives a more expensive algorithm.
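To make the O(n) bound concrete: a heap has at most $\lceil n/2^{h+1} \rceil$ nodes at height $h$, and sifting one of them down costs $O(h)$, so the total work for siftDown-based building is

$$\sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h) \;=\; O\!\left(n \sum_{h \ge 0} \frac{h}{2^h}\right) \;=\; O(n),$$

since the series $\sum_{h \ge 0} h/2^h$ converges (to 2). With siftUp the cost of a node is its depth instead, and the roughly $n/2$ leaves each sit at depth about $\log_2 n$, which gives $\Theta(n \log n)$.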
Moreover, if you are using a max-heap to do some sort of sorting where you pop the max element of the heap and then reheapify it, it's easier to do so with siftDown. You can reheapify in O(log n) time by popping the max element, putting the last element at the root node (which is empty because you popped the max) and then sifting it down all the way back to its correct spot.
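A minimal sketch of both operations in Python, assuming a max-heap stored in a 0-based array with the children of node i at 2i+1 and 2i+2 (the helper names are mine):

```python
def sift_down(a, i, n):
    # Sift a[i] down within a[0:n] until it is at least as large as
    # both of its children.
    while True:
        largest, l, r = i, 2 * i + 1, 2 * i + 2
        if l < n and a[l] > a[largest]:
            largest = l
        if r < n and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_max_heap(a):
    # O(n): only the ~n/2 internal nodes are sifted, and most of them
    # sit near the leaves, so most calls are cheap.
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))

def pop_max(a):
    # O(log n) reheapify, as described above: move the last element to
    # the root slot and sift it down to its correct spot.
    a[0], a[-1] = a[-1], a[0]
    top = a.pop()
    sift_down(a, 0, len(a))
    return top
```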

O(k log k) time algorithm to find the kth smallest element in a binary heap

We have an n-node binary min-heap containing n distinct items (smallest item at the root). For k ≤ n, give an O(k log k)-time algorithm that selects the kth smallest element from the heap.
O(k log n) is obvious, but I couldn't figure out an O(k log k) one. Maybe we can use a second heap, not sure.
Well, your intuition was right that we need an extra data structure to achieve O(k log k): if we only perform operations on the original heap, a log n term will remain in the resulting complexity.
Judging from the target complexity O(k log k), it is natural to create and maintain a heap of size k. As you may be aware, building a heap of size k top-down (by repeated insertion) takes O(k log k), which is suggestive of the goal.
The following is my attempt (not necessarily elegant or efficient) at attaining O(k log k):
We create a new min-heap, initializing its root to be the root of the original heap.
We update the new min-heap by deleting its current root and inserting that node's two children from the original heap. We repeat this process k times.
The resulting heap consists of k nodes, whose root is the kth smallest element of the original heap.
Notes: nodes in the new heap should store indexes of their corresponding nodes in the original heap, rather than the node values themselves. In each iteration of step 2 we add a net of one node to the new heap (one deleted, two inserted), so k iterations result in a new heap of size k. During the ith iteration, the node being deleted is the ith smallest element of the original heap.
Time complexity: each iteration takes O(3 log k) time to delete one element from the new heap and insert two into it. After k iterations, that is O(3k log k) = O(k log k).
Hope this solution inspires you a bit.
Assuming we're using a min-heap, so that a root node is always smaller than its children:
Create a sorted collection toVisit (a min-heap works well) containing the nodes we will traverse next. Initially this is just the root node.
Create an array smallestNodes. Initially this is empty.
While the length of smallestNodes < k:
Remove the smallest node from toVisit
Add that node to smallestNodes
Add that node's children to toVisit
When you're done, the kth smallest node is in smallestNodes[k-1].
If toVisit is implemented as a heap, it never holds more than O(k) nodes, so both removing the smallest node and inserting a child take O(log k). The loop runs k times, which makes O(k log k) total.
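Both answers describe the same frontier idea; here is a minimal sketch in Python using heapq for the auxiliary heap, assuming the original min-heap is given as a 0-based list with the children of node i at 2i+1 and 2i+2:

```python
import heapq

def kth_smallest(heap, k):
    # heap: a binary min-heap stored as a 0-based list, children of i
    # at 2*i+1 and 2*i+2. The frontier holds (value, index) pairs for
    # nodes whose parent has already been consumed; it grows by at most
    # one net entry per iteration, so it never exceeds k entries and
    # every push/pop costs O(log k).
    frontier = [(heap[0], 0)]
    for _ in range(k - 1):
        _, i = heapq.heappop(frontier)          # consume the next smallest
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):
                heapq.heappush(frontier, (heap[child], child))
    return frontier[0][0]                       # kth smallest overall
```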

Algorithm for merging two max heaps?

Is there an efficient algorithm for merging 2 max-heaps that are stored as arrays?
It depends on what the type of the heap is.
If it's a standard binary heap, where every node has up to two children and the tree is filled so that the leaves lie on at most two adjacent levels, you cannot do better than O(n) for a merge.
Just put the two arrays together and build a new heap out of them, which takes O(n).
For better merging performance, you could use another heap variant like a Fibonacci heap, which can merge in O(1) amortized.
Update:
Note that it is worse to insert all elements of the first heap one by one into the second heap, or vice versa, since each insertion takes O(log n).
As your comment states, you don't seem to know how a heap is optimally built in the first place (again, for a standard binary heap):
Create an array and put the elements of both heaps into it in some arbitrary order.
Now start at the lowest level. The lowest level contains trivial max-heaps of size 1, so this level is done.
Move up a level. Whenever the heap condition of one of the sub-heaps is violated, swap the root of that sub-heap with its bigger child. Afterwards, level 2 is done.
Move to level 3. Whenever the heap condition is violated, proceed as before: swap the node down with its bigger child, recursively, until everything is in order up to level 3.
...
When you reach the top, you have created a new heap in O(n).
I omit a proof here, but the intuition is that most of the work happens on the bottom levels, where the sub-heaps are small and little swapping is needed to re-establish the heap condition. This is much better than inserting every element into one of the heaps one at a time: each such insertion operates on the whole heap of height O(log n), for O(n log n) in total.
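A minimal sketch of this merge in Python, assuming 0-based heap arrays with the children of node i at 2i+1 and 2i+2 (the function names are mine):

```python
def merge_max_heaps(a, b):
    # Concatenate both heap arrays, then rebuild bottom-up, exactly as
    # in the level-by-level procedure above: leaves are already trivial
    # heaps, so only the n//2 internal nodes are sifted down. O(n).
    h = a + b
    n = len(h)

    def sift_down(i):
        while True:
            largest, l, r = i, 2 * i + 1, 2 * i + 2
            if l < n and h[l] > h[largest]:
                largest = l
            if r < n and h[r] > h[largest]:
                largest = r
            if largest == i:
                return
            h[i], h[largest] = h[largest], h[i]
            i = largest

    for i in range(n // 2 - 1, -1, -1):
        sift_down(i)
    return h
```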
Update 2: A binomial heap allows merging in O(log(n)) and would conform to your O(log(n)^2) requirement.
Two binary heaps of sizes n and k can be merged in O(log n * log k) comparisons. See
Jörg-R. Sack and Thomas Strothotte, An algorithm for merging heaps, Acta Informatica 22 (1985), 172-186.
I think what you're looking for in this case is a Binomial Heap.
A binomial heap is a collection of binomial trees and a member of the mergeable-heap family. The worst-case running time for a union (merge) of two binomial heaps with n total items is O(lg n).
See http://en.wikipedia.org/wiki/Binomial_heap for more information.
