Is there an efficient algorithm for merging 2 max-heaps that are stored as arrays?
It depends on what the type of the heap is.
If it's a standard binary heap, where every node has up to two children and the tree is filled so that the leaves are on at most two different levels, you cannot do better than O(n) for merge.
Just put the two arrays together and create a new heap out of them which takes O(n).
For better merging performance, you could use another heap variant like a Fibonacci-Heap which can merge in O(1) amortized.
Update:
Note that it is worse to insert all elements of the first heap one by one to the second heap or vice versa since an insertion takes O(log(n)).
As your comment states, you don't seem to know how the heap is optimally built in the beginning (again, for a standard binary heap):
Create an array and put in the elements of both heaps in some arbitrary order.
Now start at the lowest level. The lowest level contains trivial max-heaps of size 1, so this level is done.
Move a level up. When the heap condition of one of the "sub-heaps" is violated, swap the root of the "sub-heap" with its larger child. Afterwards, level 2 is done.
Move to level 3. When the heap condition is violated, proceed as before: swap the root down with its larger child and continue recursively until everything matches up to level 3.
...
When you reach the top, you have created a new heap in O(n).
I omit a proof here, but the intuition is that most of the work happens on the bottom levels, where little swapping is needed to re-establish the heap condition. You operate on much smaller "sub-heaps", which is much better than inserting every element into one of the heaps: then each insertion operates on the whole heap, which takes O(log(n)) per element and O(n log(n)) in total.
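The steps above can be sketched in Python as follows. This is a minimal sketch, assuming 0-based arrays where the children of index i sit at 2i+1 and 2i+2; the names sift_down, merge_max_heaps and is_max_heap are made up for illustration and do not come from any library.

```python
def sift_down(a, i, n):
    """Move a[i] down until the max-heap property holds in its subtree."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def merge_max_heaps(h1, h2):
    """Concatenate two array-backed max-heaps and rebuild bottom-up in O(n)."""
    merged = list(h1) + list(h2)
    n = len(merged)
    # Leaves are already heaps of size 1, so start at the last parent.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(merged, i, n)
    return merged


def is_max_heap(a):
    """Check that every element is no larger than its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))
```

For example, merge_max_heaps([9, 5, 8, 1, 2], [7, 6, 3]) returns an array that passes is_max_heap and has 9 at index 0. The fact that the inputs were already heaps is not exploited here; the rebuild treats them as arbitrary arrays.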
Update 2: A binomial heap allows merging in O(log(n)) and would conform to your O(log(n)^2) requirement.
Two binary heaps of sizes n and k can be merged in O(log n * log k) comparisons. See
Jörg-R. Sack and Thomas Strothotte, An algorithm for merging heaps, Acta Informatica 22 (1985), 172-186.
I think what you're looking for in this case is a Binomial Heap.
A binomial heap is a collection of binomial trees, a member of the merge-able heap family. The worst-case running time for a union (merge) on 2+ binomial heaps with n total items in the heaps is O(lg n).
See http://en.wikipedia.org/wiki/Binomial_heap for more information.
Related
Consider a binary max-heap with n elements. It will have a height of O(log n). When new elements are inserted into the heap, they will be propagated in the heap so that max-heap property is satisfied always.
The new element will be added as a child on the last level. But after insertion, the max-heap property may be violated. Hence, the heapify method will be used. This has a time complexity of O(log n), i.e. the height of the heap.
But can we make it even more efficient?
When multiple insert and delete are performed, this procedure makes things slow. Also, it is a strict requirement that the heap should be a max-heap post every insertion.
The objective is to reduce the time complexity of the heapify method. This is possible only when the number of comparisons is reduced.
The objective is to reduce the time complexity of the heapify method.
That is a pity, because that is impossible, in contrast to
Reduce the time complexity of multiple inserts and deletes:
Imagine not inserting into the n-item heap immediately,
but building an auxiliary one (or even a list) instead.
On delete (extract?), place one item from the auxiliary (now at size k) "in the spot emptied" and do a sift-down or up as required if k << n.
If the auxiliary data structure is not significantly smaller than the main one, merge them.
Such ponderings lead to advanced heaps like Fibonacci, pairing, Brodal…
The time complexity of the insert operation in a heap depends on the number of comparisons that are made. One can imagine spending some overhead to implement a smart binary search along the leaf-to-root path.
However, the time complexity is not only determined by the number of comparisons. Time complexity is determined by any work that must be performed, and in this case the number of writes is also O(log𝑛) and that number of writes cannot be reduced.
The number of nodes whose values need to change during the insert operation is O(log𝑛). Reducing the number of comparisons is not enough to reduce the complexity.
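To make this concrete, here is a minimal sketch of such an insert for an array-backed max-heap (0-based, children of i at 2i+1 and 2i+2). The function name insert_with_binary_search and the two counters are made up for illustration. The binary search needs only about log(log n) comparisons, but the shifting step still writes to every node that moves.

```python
def insert_with_binary_search(heap, x):
    """Insert x into an array-backed max-heap; return (comparisons, writes)."""
    heap.append(x)
    # Indices of the ancestors of the new leaf, from its parent up to the root.
    path = []
    i = len(heap) - 1
    while i > 0:
        i = (i - 1) // 2
        path.append(i)
    # Ancestor values are non-decreasing toward the root, so a binary search
    # finds how many of the lowest ancestors are smaller than x.
    lo, hi, comparisons = 0, len(path), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if heap[path[mid]] < x:
            lo = mid + 1
        else:
            hi = mid
    # Shift those ancestors down by one level each, then place x.
    pos, writes = len(heap) - 1, 0
    for j in range(lo):
        heap[pos] = heap[path[j]]
        pos = path[j]
        writes += 1
    heap[pos] = x
    writes += 1
    return comparisons, writes
```

Inserting increasing values forces every new element to travel to the root, so the write count grows with the height of the heap while the comparison count stays much smaller.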
I am currently studying the binomial heap.
I learned that the following operations on binomial heaps can be completed in Theta(log n) time:
Get-max
Insert
Extract Max
Merge
Increase-Key
Delete
However, Increase-Key and Delete are said to need a pointer to the element in order to complete in Theta(log n) time.
Here are 3 questions I want to ask:
Is this because, without a pointer to the element, Increase-Key and Delete would have to search for the element before the operation can take place?
What is the time complexity of searching a binomial heap? (I believe O(n).)
If the pointer to the element is not given, will Increase-Key and Delete take O(n) time, or can it be lower than that?
It’s good that you’re thinking about this!
Yes, that’s exactly right. The nodes in a binomial heap are organized in a way that makes it very quick to find the maximum value, but the relative ordering of the remaining elements is not guaranteed to be in an order that makes it easy to find things.
There isn’t a general way to search a binomial heap for an element faster than O(n). Or, stated differently, the worst-case cost of any way of searching a binomial heap is Ω(n). Here’s one way to see this. Form a binomial heap where n-1 items have priority 137 and one item has priority 42. The item with priority 42 must be a leaf node. There are (roughly) n/2 leaves in the heap, and since there is no ordering on them to find that one item you’d have to potentially look at all the leaves. To formalize this, you could form multiple different binomial heaps with these items, and whatever algorithm was looking for the item of priority 42 would necessarily have to find it in the last place it looks at least once.
For the reasons given above, no, there’s no way to implement those operations quickly without having pointers to them, since in the worst case you have to search everywhere.
How is the bottom up approach of heap construction of the order O(n) ? Anany Levitin says in his book that this is more efficient compared to top down approach which is of order O(log n). Why?
That to me seems like a typo.
There are two standard algorithms for building a heap. The first is to start with an empty heap and to repeatedly insert elements into it one at a time. Each individual insertion takes time O(log n), so we can upper-bound the cost of this style of heap-building at O(n log n). It turns out that, in the worst case, the runtime is Θ(n log n), which happens if you insert the elements in reverse-sorted order.
The other approach is the heapify algorithm, which builds the heap directly by starting with each element in its own binary heap and progressively coalescing them together. This algorithm runs in time O(n) regardless of the input.
The reason why the first algorithm requires time Θ(n log n) is that, if you look at the second half of the elements being inserted, you'll see that each of them is inserted into a heap whose height is Θ(log n), so the cost of doing each bubble-up can be high. Since there are n / 2 elements and each of them might take time Θ(log n) to insert, the worst-case runtime is Θ(n log n).
On the other hand, the heapify algorithm spends the majority of its time working on small heaps. Half the elements are inserted into heaps of height 0, a quarter into heaps of height 1, an eighth into heaps of height 2, etc. This means that the bulk of the work is spent inserting elements into small heaps, which is significantly faster.
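The difference can be observed by counting swaps. The following is a minimal sketch for a max-heap stored in a 0-based array; the function names are made up for illustration. Note the assumption that, for a max-heap, it is ascending input that forces every inserted element to bubble all the way to the root (the mirror image of reverse-sorted input for a min-heap).

```python
def build_by_insertion(items):
    """Build a max-heap by repeated insertion; return (heap, swap count)."""
    heap, swaps = [], 0
    for x in items:
        heap.append(x)
        i = len(heap) - 1
        # Bubble the new element up while it is larger than its parent.
        while i > 0 and heap[(i - 1) // 2] < heap[i]:
            parent = (i - 1) // 2
            heap[parent], heap[i] = heap[i], heap[parent]
            i = parent
            swaps += 1
    return heap, swaps


def build_by_heapify(items):
    """Build a max-heap bottom-up; return (heap, swap count)."""
    heap, swaps = list(items), 0
    n = len(heap)
    # Sift down every parent, starting from the last one.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and heap[child] > heap[largest]:
                    largest = child
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
            swaps += 1
    return heap, swaps
```

Running both on list(range(1023)) gives a swap count for the insertion variant that is several times larger than for heapify, whose swap count stays below the number of elements.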
If you consider swapping to be your basic operation -
In top-down construction, the tree is constructed first and a heapify function is called on the nodes. The worst case would swap log n times (to sift the element to the top of the tree, where the height of the tree is log n) for all the n/2 leaf nodes. This results in an O(n log n) upper bound.
In bottom up construction, you assume all the leaf nodes to be in order in the first pass, so heapify is now called only on n/2 nodes. At each level, the number of possible swaps increases but the number of nodes on which it happens decreases.
For example -
At the level right above leaf nodes,
we have n/4 nodes that can have at most 1 swap each.
At its parent level, we have
n/8 nodes that can have at most 2 swaps each, and so on.
On summation, we'll come up with an O(n) efficiency for bottom-up construction of a heap.
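The summation can be written out as follows, assuming roughly n/2^(h+1) nodes at height h (leaves have height 0) and at most h swaps for each of them:

```latex
\sum_{h=1}^{\lfloor \log_2 n \rfloor} \frac{n}{2^{h+1}} \cdot h
\;\le\; \frac{n}{2} \sum_{h=1}^{\infty} \frac{h}{2^{h}}
\;=\; \frac{n}{2} \cdot 2
\;=\; n
```

The middle step uses the standard identity that the series h/2^h sums to 2, so the total number of swaps is at most about n.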
It generally refers to a way of solving a problem. Especially in computer science algorithms.
Top down :
Take the whole problem and split it into two or more parts.
Find solution to these parts.
If these parts turn out to be too big to be solved as a whole, split them further and find solutions to those sub-parts.
Merge solutions according to the sub-problem hierarchy thus created after all parts have been successfully solved.
In the regular heapify(), we perform two comparisons on each node from top to bottom to find the largest of three elements:
Parent node with left child
The larger node from the first comparison with the second child
Bottom up :
Breaking the problem into smallest possible(and practical) parts.
Finding solutions to these small sub-problems.
Merging the solutions you get iteratively(again and again) till you have merged all of them to get the final solution to the "big" problem. The main difference in approach is splitting versus merging. You either start big and split "down" as required or start with the smallest and merge your way "up" to the final solution.
Bottom-up Heapsort, on the other hand, only compares the two children and follows the larger child to the end of the tree ("top-down"). From there, the algorithm goes back towards the tree root (“bottom-up”) and searches for the first element larger than the root. From this position, all elements are moved one position towards the root, and the root element is placed in the field that has become free.
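A minimal sketch of this bottom-up sift-down, embedded in a heapsort, might look as follows. It assumes a 0-based array with children at 2i+1 and 2i+2; the function names are chosen for illustration only. The climb stops at the first element that is not smaller than the root value.

```python
def leaf_search(a, i, end):
    """Follow the larger child from i down to a leaf; return the leaf index."""
    j = i
    while 2 * j + 2 < end:
        # Both children exist: a single comparison picks the larger one.
        j = 2 * j + 2 if a[2 * j + 2] > a[2 * j + 1] else 2 * j + 1
    if 2 * j + 1 < end:
        j = 2 * j + 1  # Only a left child exists at the last level.
    return j


def bottom_up_sift_down(a, i, end):
    """Restore the max-heap property below i using the bottom-up strategy."""
    j = leaf_search(a, i, end)
    # Climb back toward i until an element not smaller than a[i] is found.
    while a[i] > a[j]:
        j = (j - 1) // 2
    # Put the root value at j and move the elements above j one step up,
    # using a[i] as the temporary slot.
    while j > i:
        a[i], a[j] = a[j], a[i]
        j = (j - 1) // 2


def bottom_up_heapsort(a):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        bottom_up_sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        bottom_up_sift_down(a, 0, end)
    return a
```

The design pays off during the sorting phase: the element swapped to the root usually belongs near the bottom, so descending with one comparison per level and climbing back a short distance tends to need fewer comparisons than the regular two-comparisons-per-level sift-down.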
Binary Heap can be built in two ways:
Top-Down Approach
Bottom-Up Approach
In the Top-Down Approach, first begin with 3 elements. You consider 2 of them as heaps and the third as a key k. You then create a new Heap by joining these two sub-heaps with the key as the root node. Then, you perform Heapify to maintain the heap order (either Min or Max Heap order).
Then, we take two such heaps (containing 3 elements each) and another element as the key k, and create a new heap. We keep repeating this process, increasing the size of each sub-heap until all elements are added.
This process adds half the elements in the bottom level, 1/4th in the second-to-last one, 1/8th in the third-to-last one and so on; therefore, the complexity of this approach works out to O(n).
In the bottom-up approach, we first simply create a complete binary tree from the given elements. We then apply the DownHeap operation on each parent in the tree, starting from the last parent and going up the tree until the root. This is a much simpler approach. Since DownHeap's worst case is O(log n) and we apply it to n/2 elements of the tree, a simple upper bound for this method is O(n log n); however, most parents sit near the bottom of the tree, where DownHeap is cheap, so the tighter bound is O(n).
I want to find the maximum number of comparisons needed to convert a min-heap with n nodes into a max-heap. I think a min-heap can be converted to a max-heap in O(n), which means there is no way other than to re-create the heap.
As a crude lower bound, given a tree with the (min- or max-) heap property, we have no prior idea about how the values at the leaves compare to one another. In a max heap, the values at the leaves all may be less than all values at the interior nodes. If the heap has the topology of a complete binary tree, then even finding the min requires at least roughly n/2 comparisons, where n is the number of tree nodes.
If you have a min-heap of known size, then you can create a binary max-heap of its elements by filling an array from back to front with the values obtained by iteratively deleting the root node from the min-heap until it is exhausted. Under some circumstances this can even be done in place. Using the rule that the root node is element 0 and the children of node i are elements 2i+1 and 2i+2, the (max-) heap condition will automatically be satisfied for the heap represented by the new array.
Each deletion from a min-heap of size m requires up to log(m) element comparisons to restore the heap condition, however. I think that adds up to O(n log n) comparisons for the whole job. I am doubtful that you can do it with any lower complexity without adding conditions. In particular, if you do not perform genuine heap deletions (incurring the cost of restoring the heap condition), then I think you incur comparable additional costs to ensure that you end up with a heap in the end.
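The deletion-based conversion described above can be sketched with Python's standard heapq module, which implements a binary min-heap on a list. The function name is made up for illustration, and the sketch copies the input rather than working in place.

```python
import heapq


def min_heap_to_max_heap_by_deletion(min_heap):
    """Fill a new array back to front with successive minima of min_heap."""
    src = list(min_heap)  # work on a copy so the caller's heap is untouched
    out = [None] * len(src)
    for k in range(len(src) - 1, -1, -1):
        # Each pop costs O(log m) comparisons for a heap of size m.
        out[k] = heapq.heappop(src)
    return out


def is_max_heap(a):
    """Check the max-heap condition with children of i at 2i+1 and 2i+2."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))
```

The result is simply the elements in descending order, and any descending array satisfies the max-heap condition, which is why no further repair step is needed.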
Consider two min-heaps H1,H2 of size n1 and n2 respectively, such that every node in H2 is greater than every node of H1.
How can I merge these two heaps into one heap "H" in O(n2) (not O(n^2))?
(Assume that the heaps are represented in arrays of size > n1+n2.)
A heap can be constructed in linear time. This means you need only take all the elements and construct a heap from them to get linear complexity. However, you can use a "fancier" heap, for instance a leftist heap, and perform the merge operation even faster.