Merging two max heaps which are complete binary trees - algorithm

Let H1 and H2 be two complete binary trees that are heaps as well. Assume H1 and H2 are max-heaps, each of size n. Design and analyse an efficient algorithm to merge H1 and H2 into a new max-heap H of size 2n.
==========================================================================
Approach - First copy the two arrays of H1 and H2 into a new array of size 2n, then apply the build-heap operation to get H. Time complexity = O(2n) = O(n). But don't we need to apply max-heapify after building the heap? So where is the O(logn) time for that accounted for?
===================================================================
Another approach says merging two max heaps takes O(n+m) time. Now, which is correct, and why does no one account for max-heapify?

The MaxHeapify operation takes O(logn) time.
In the build-heap operation we call MaxHeapify n times, so it seems that the total complexity of build-heap is O(nlogn).
But that is not correct: build-heap actually takes only O(n) time, because most of the MaxHeapify calls run on small subtrees near the leaves. You can refer to this link for the analysis.
https://www.geeksforgeeks.org/time-complexity-of-building-a-heap/
Hence it takes O(2n) = O(n) time to build the new heap H of size 2n.
If you consider two max heaps of size m and n, it takes O(m+n) time to build a new heap of size m+n.
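As a concrete illustration, here is a minimal Python sketch of this approach. The function names are illustrative; build_max_heap is the standard bottom-up construction whose total work is O(n) despite the individual O(logn) sifts.

```python
def max_heapify(a, i, size):
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_max_heap(a):
    # Bottom-up construction: sift down every internal node, O(len(a)) total.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

def merge_max_heaps(h1, h2):
    """Merge two array-based max-heaps in O(n1 + n2) time."""
    merged = h1 + h2        # O(n1 + n2) copy
    build_max_heap(merged)  # O(n1 + n2) build
    return merged
```

For example, merge_max_heaps([9, 5, 7], [8, 6, 2]) yields a valid max-heap over all six elements. No separate max-heapify pass is needed afterwards, because build-heap already leaves every subtree heap-ordered.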

Related

Finding k smallest elements in a min heap - worst-case complexity

I have a minimum heap with n elements and want to find the k smallest numbers from this heap. What is the worst-case complexity?
Here is my approach: somewhere on Stack Overflow I read that the complexity of finding the i-th smallest number in a min-heap is O(i). So if we would like to find the n-1 smallest numbers (n is pointless, since that would be the entire heap), the total complexity would look something like this:
O(n-1) + O(n-2) + ... + O(2) + O(1) = O(n(n-1)/2) = O(n^2)
Is this correct?
No, the time is much better than that. O(k log(n)) very easily, and O(k) if you're smart.
Finding and removing the smallest element from the heap is O(log(n)). This leads to O(k log(n)) time very easily.
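For concreteness, the O(k log(n)) version is just k delete-min operations. A minimal sketch using Python's heapq, working on a copy so the original heap is left intact:

```python
import heapq

def k_smallest_by_popping(heap, k):
    # `heap` is assumed to already be a valid min-heap array.
    h = list(heap)  # O(n) copy; each pop below is O(log n)
    return [heapq.heappop(h) for _ in range(min(k, len(h)))]
```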
BUT the result that you are thinking of is Frederickson's heap-selection paper, https://www.sciencedirect.com/science/article/pii/S0890540183710308, which shows how to find the value of the kth smallest number in time O(k). Now you use the fact that a heap is a binary tree: start from the root and do a recursive search for every number you find that is smaller than that value, then fill out the rest of your list with copies of the k'th smallest number.
In that search you will wind up looking at up to k-1 elements that are smaller than that value, and for some of them you will look at up to 2 children that are too large to bother with, for a maximum of 3k-3 elements examined. This makes the whole algorithm O(k).
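A sketch of that collection step, assuming the value of the kth smallest (kth_value) has already been found, e.g. by Frederickson's O(k) selection; the function name and the implicit array layout are illustrative:

```python
def k_smallest_given_kth_value(heap, k, kth_value):
    """Collect the k smallest elements of an array-based binary min-heap,
    given the value of the kth smallest. Only nodes strictly below
    kth_value (at most k-1 of them) plus their immediate children are
    visited, so at most 3k-3 nodes are examined: O(k) overall."""
    out = []
    stack = [0] if heap else []
    while stack:
        i = stack.pop()
        if i < len(heap) and heap[i] < kth_value:
            out.append(heap[i])
            stack.extend((2 * i + 1, 2 * i + 2))
    out.extend([kth_value] * (k - len(out)))  # pad with copies of the kth value
    return out
```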
I am doubtful that it is possible to identify the kth smallest element in time O(k). The best I have seen before is an O(k log k) algorithm, which also conveniently solves your problem by identifying the k smallest elements. You can read the details in another answer on StackOverflow or on Quora.
The basic idea is to manipulate a secondary heap. Initially, this secondary heap contains only the root of the original heap. At each step, the algorithm deletes the min of the secondary heap and inserts its two original children (that is, its children from the original heap) into the secondary heap.
This algorithm has a nice property that on step i, the element it deletes from the secondary heap is the ith smallest element overall. So after k steps, the set of items which have been deleted from the secondary heap are exactly the k smallest elements. This algorithm is O(k log k) because there are O(k) deletions/insertions into a secondary heap which is upper bounded in size by O(k).
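A minimal Python sketch of this secondary-heap algorithm, assuming the original heap is stored as an array in the usual implicit layout (children of index i at 2i+1 and 2i+2):

```python
import heapq

def k_smallest(heap, k):
    """k smallest elements of an array-based binary min-heap in O(k log k).
    The secondary heap holds (value, index) pairs; the original heap is
    never modified."""
    if not heap or k <= 0:
        return []
    out = []
    secondary = [(heap[0], 0)]  # start with the root of the original heap
    while secondary and len(out) < k:
        value, i = heapq.heappop(secondary)  # i-th deletion = i-th smallest
        out.append(value)
        for child in (2 * i + 1, 2 * i + 2):  # its children in the original heap
            if child < len(heap):
                heapq.heappush(secondary, (heap[child], child))
    return out
```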
EDIT: I stand corrected! btilly's answer provides a solution in O(k) using a result from this paper.
There is a recent (2019) algorithm, based on the soft heap data structure, that finds the k smallest elements of a binary min-heap in time O(k). This is a dramatically simpler algorithm than Frederickson's original O(k)-time heap selection algorithm. See “Selection from Heaps, Row-Sorted Matrices, and X+Y Using Soft Heaps” by Kaplan et al.

Merge two heap trees in log(n+m)

How do I merge two heap trees (the kind of the trees is not defined), when the size of one is n and of the other is m, in O(log(n+m))?
For ordinary array-based binary heaps the bound is impossible: even just copying the nodes of a single heap into a new tree takes O(N) time, and inserting them one by one takes O(N log N), so you cannot finish in O(log N) time even in the best case, when the two heaps are identical. The question only makes sense for pointer-based meldable heaps, such as leftist heaps, skew heaps, or binomial heaps, which do support merge in O(log(n+m)).
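For illustration, a sketch of the leftist-heap merge (the node structure and names are illustrative; s is the null-path length, or rank). The recursion only descends the right spines, each of logarithmic length in its heap's size, giving O(log n + log m) = O(log(n+m)):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.s = 1  # null-path length ("rank")

def merge(a, b):
    """Merge two leftist min-heaps in O(log n + log m)."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a                  # keep the smaller root on top
    a.right = merge(a.right, b)      # recurse down the short right spine
    # Restore the leftist invariant: rank(left) >= rank(right).
    if a.left is None or (a.right is not None and a.left.s < a.right.s):
        a.left, a.right = a.right, a.left
    a.s = 1 + (a.right.s if a.right else 0)
    return a
```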

Merge two minimum-heaps to one heap in efficiency?

Consider two min-heaps H1, H2 of size n1 and n2 respectively, such that every node in H2 is greater than every node of H1.
How can I merge these two heaps into one heap H, in O(n2) (not O(n^2))?
(Assume that the heaps are represented in arrays of size > n1+n2.)
A heap can be constructed in linear time, see here. This means you need only take all the elements and build a heap from them to get linear complexity. However, you can use a "fancier" heap, for instance a leftist heap, and perform the merge operation even faster.
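A minimal sketch of that linear-time construction using Python's heapq (heapq.heapify is the standard bottom-up O(n) build); note this achieves O(n1 + n2) without even using the extra assumption that H2's elements all exceed H1's:

```python
import heapq

def merge_min_heaps(h1, h2):
    """Merge two array-based min-heaps in O(n1 + n2)."""
    merged = h1 + h2       # O(n1 + n2) copy
    heapq.heapify(merged)  # O(n1 + n2) bottom-up build
    return merged
```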

Analysis of speed and memory for heapsort

I tried googling and wiki'ing these questions but can't seem to find concrete answers. Most of what I found involved proofs using the master theorem, but I'm hoping for something in plain English that can be more intuitively remembered. Also, I am not in school, and these questions are for interviewing.
MEMORY:
What exactly does it mean to determine big-O in terms of memory usage? For example, why is heapsort considered to run with O(1) memory when you have to store all n items? Is it because you are creating only one structure for the heap? Or is it because you know its size, so you can create it on the stack, which is always constant memory usage?
SPEED:
How is the creation of the heap done in O(n) time if adding elements is done in O(1) but percolating is done in O(logn)? Wouldn't that mean you do n inserts at O(1), making it O(n), and percolate after each insert at O(logn), so O(n) * O(logn) = O(nlogn) in total? I also noticed most implementations of heapsort use a heapify function instead of percolating to create the heap. Since heapify does n comparisons at O(logn), that would be O(nlogn), and with n inserts at O(1) we would get O(n) + O(nlogn) = O(nlogn)? Wouldn't the first approach yield better performance than the second with small n?
I kind of assumed this above, but is it true that doing an O(1) operation n times would result in O(n) time? Or does n * O(1) = O(1)?
So I found some useful info about building a binary heap from wikipedia: http://en.wikipedia.org/wiki/Binary_heap#Building_a_heap.
I think my main source of confusion was how "inserting" into a heap is both O(1) and O(logn), even though the first shouldn't really be called an insertion, just a build step. So you wouldn't use heapify any more after you've created your heap; instead you'd use the O(logn) insertion method.
The method of adding items iteratively while maintaining the heap property runs in O(nlogn), while creating the heap without respecting the heap property and then heapifying actually runs in O(n), for a reason that isn't very intuitive and requires a proof, so I was wrong about that.
The removal step to get the ordered items is the same cost, O(nlogn), once each method has a heap that respects the heap property.
So in the end you'd have O(1) + O(n) + O(nlogn) = O(nlogn) for the build-heap method, and O(nlogn) + O(nlogn) = O(nlogn) for the insertion method. Obviously the first is preferable, especially for small n.
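To make the two phases concrete, here is a sketch of in-place heapsort (illustrative names): an O(n) bottom-up build followed by n O(logn) extractions, sorting in O(nlogn) time with O(1) extra memory since everything happens inside the input array:

```python
def sift_down(a, i, size):
    """Restore the max-heap property for the subtree rooted at i."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build phase: heapify, O(n)
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):      # removal phase: n extractions, O(nlogn)
        a[0], a[end] = a[end], a[0]      # move the current max into place
        sift_down(a, 0, end)
```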

Is there one type of set-like data structure supporting merging in O(logn) time and k-th search in O(logn) time? (n is the size of this set)

Is there one type of set-like data structure supporting merging in O(logn) time and k-th element search in O(logn) time? n is the size of this set.
You might try a Fibonacci heap, which does merge in constant amortized time and decrease-key in constant amortized time. Most of the time, such a heap is used for operations where you are repeatedly pulling the minimum value, so a check-for-membership function isn't implemented. However, it is simple enough to add one using the decrease-key logic while simply removing the decrease portion.
If k is a constant, then any meldable heap will do, including leftist heaps, skew heaps, pairing heaps and Fibonacci heaps. Both merging and getting the first element in these structures typically take O(1) or O(lg n) amortized time, so O(k lg n) maximum.
Note, however, that getting to the k'th element may be destructive in the sense that the first k-1 items may have to be removed from the heap.
If you're willing to accept amortization, you could achieve the desired bounds of O(lg n) time for both meld and search by using a binary search tree to represent each set. Melding two trees of size m and n together requires time O(m log(n / m)) where m < n. If you use amortized analysis and charge the cost of the merge to the elements of the smaller set, at most O(lg n) is charged to each element over the course of all of the operations. Selecting the kth element of each set takes O(lg n) time as well.
I think you could also use a collection of sorted arrays to represent each set, but the amortization argument is a little trickier.
As stated in the other answers, you can use heaps, but getting O(lg n) for both meld and select requires some work.
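As a sketch of the selection half of the tree-based answer above (balancing and the meld itself are omitted; names are illustrative): augmenting each node with its subtree size turns select-kth into a single root-to-leaf descent, which is O(lg n) in a balanced tree:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + (left.size if left else 0) + (right.size if right else 0)

def select(node, k):
    """Return the k-th smallest key (1-indexed); O(height) time."""
    while node is not None:
        left_size = node.left.size if node.left else 0
        if k == left_size + 1:
            return node.key
        if k <= left_size:
            node = node.left
        else:
            k -= left_size + 1
            node = node.right
    raise IndexError("k out of range")
```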
Finger trees can do this and some more operations:
http://en.wikipedia.org/wiki/Finger_tree
There may be something even better if you are not restricted to purely functional, i.e. "persistent", data structures (where "persistent" means not "backed up on non-volatile disk storage" but "all previous versions of the data structure remain available even after adding elements").
