Delete and Increase key for Binomial heap - data-structures

I'm currently studying binomial heaps. I learned that the following operations on a binomial heap can be completed in Θ(log n) time:
Get-max
Insert
Extract Max
Merge
Increase-Key
Delete
But the two operations Increase-Key and Delete are said to need a pointer to the element in order to complete in Θ(log n).
Here are 3 questions I want to ask:
Is this because, if Increase-Key and Delete aren't given a pointer to the element, they have to search for the element before the operation can take place?
What is the time complexity of a search operation on a binomial heap? (I believe it's O(n).)
If a pointer to the element is not given for Increase-Key and Delete, will those two operations take O(n) time, or can it be lower than that?

It’s good that you’re thinking about this!
Yes, that's exactly right. The nodes in a binomial heap are organized in a way that makes it very quick to find the maximum value, but the relative ordering of the remaining elements is not guaranteed to be in an order that makes it easy to find things.
There isn’t a general way to search a binomial heap for an element faster than O(n). Or, stated differently, the worst-case cost of any way of searching a binomial heap is Ω(n). Here’s one way to see this. Form a binomial heap where n-1 items have priority 137 and one item has priority 42. The item with priority 42 must be a leaf node. There are (roughly) n/2 leaves in the heap, and since there is no ordering on them to find that one item you’d have to potentially look at all the leaves. To formalize this, you could form multiple different binomial heaps with these items, and whatever algorithm was looking for the item of priority 42 would necessarily have to find it in the last place it looks at least once.
For the reasons given above, no, there's no way to implement those operations quickly without having pointers to the elements, since in the worst case you have to search everywhere.
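To make that concrete, here is a minimal sketch (the node layout and function name are illustrative, not a standard API) of searching a max-ordered binomial heap stored in the usual leftmost-child/right-sibling representation. Heap order lets you skip subtrees whose root is already smaller than the target, but in the worst case, like the 42-among-137s example above, you still visit every node:

```python
class Node:
    """One node of a binomial tree, in leftmost-child/right-sibling layout."""
    def __init__(self, key):
        self.key = key
        self.child = None    # leftmost child
        self.sibling = None  # next child / next root to the right

def find(node, key):
    """Search the tree list rooted at `node` for `key`; O(n) worst case.
    Max-heap order only lets us prune subtrees whose root is already
    smaller than the target -- everything below it is smaller still."""
    while node is not None:
        if node.key == key:
            return node
        if node.key > key:                 # target could still be below here
            hit = find(node.child, key)
            if hit is not None:
                return hit
        node = node.sibling                # otherwise skip the whole subtree
    return None
```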

Related

Algorithms which proceed in a manner most similar to a PQ-based sort

So I'm attempting to figure out algorithms which proceed in a manner most similar to a PQ-based sort for the following structures.
1-heap
3-heap
n-1 heap
BST
Balanced BST
For example, consider heapsort and d-heaps: heapsort uses a 2-heap as an intermediate representation to sort the contents. For heapsort, the PQ is a 2-heap even though any PQ would work.
I'm not sure what you mean by a 1-heap. Using standard terminology, a 1-heap would be a heap with one child per node: a linked list. That wouldn't perform very well.
A 3-heap is just a d-ary heap where d=3. You say that heap sort uses a 2-heap. That's not always true. Many people implement heap sort with a 2-heap, often because they don't know that they could use a 3-heap, which is unfortunate because a 3-heap can be more efficient in many cases. I and many others have implemented heap sort with a 3-heap.
If by "n-1 heap" you mean a heap with n-1 children of the root, that's not going to be very efficient. With n-1 children of the root, when you remove the smallest item, you have to search all n-1 children to find the new root. You might as well just repeatedly search a linear list for the next largest item. An "n-1" heap sort will have the same performance as a selection sort.
A balanced binary search tree will perform better than an unbalanced one, but it's expensive to build the tree. The beauty of heap sort is that you can build the heap in O(n), in place. Building a BST is O(n log n). Although removing the smallest node is an O(log n) operation for both, real world performance favors the heap.
All that said, any data structure that can serve as a priority queue can be used in a PQ-based sort. Binary heap (more generally, d-ary heap) is commonly used because it's easy to implement and very efficient. Depending on your application, though, something like a pairing heap could potentially outperform a binary heap.
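To illustrate, here is a hedged sketch of heapsort parameterized by d (set d=3 for a 3-heap); the function name and structure are my own, not from any particular library. The only thing that changes relative to the familiar 2-heap version is the child-index arithmetic:

```python
def heapsort(a, d=3):
    """In-place heapsort on a d-ary max-heap; d=2 gives the classic version."""
    def sift_down(i, size):
        while True:
            first = d * i + 1              # children of i live at d*i+1 .. d*i+d
            if first >= size:
                return
            largest = max(range(first, min(first + d, size)), key=a.__getitem__)
            if a[largest] <= a[i]:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    n = len(a)
    for i in range(n // d, -1, -1):        # bottom-up heap construction, O(n)
        sift_down(i, n)
    for end in range(n - 1, 0, -1):        # repeatedly move the max to the back
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
```

Calling heapsort([5, 1, 9, 3]) sorts the list in place in ascending order; a 3-heap trades a shallower tree (fewer sift-down levels) against one extra comparison per level.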

Data structure design with O(log n) amortized time?

I'm working on this problem but I'm pretty confused on how to solve it:
Design a data structure that supports the following operations in amortized O(log n) time, where n is the total number of elements:
Ins(k): Insert a new element with key k
Extract-Max: Find and remove the element with the largest key
Extract-Min: Find and remove the element with the smallest key
Union: Merge two different sets of elements
How do I calculate the amortized time? Isn't this already something like a hash table? Or is it a variant of it?
I would really appreciate if someone can help me with this.
Thank you!!
What you're proposing isn't something that most hash tables are equipped to deal with because hash tables don't usually support finding the min and max elements quickly while supporting deletions.
However, this is something that you could do with a pair of priority queues that support melding. For example, suppose that you back your data structure with two binomial heaps - a min-heap and a max-heap. Every time you insert an element into your data structure, you add it to both the min-heap and the max-heap. However, you slightly modify the two heaps so that each element in the heap stores a pointer to its corresponding element in the other heap; that way, given a node in the min-heap, you can find the corresponding node in the max-heap and vice-versa.
Now, to do an extract-min or extract-max, you just apply a find-min operation to the min-heap or a find-max operation to the max-heap to get the result. Then, delete that element from both heaps using the normal binomial heap delete operation. (You can use the pointer you set up during the insert step to quickly locate the sibling element in the other heap).
Finally, for a union operation, just apply the normal binomial heap merge operation to the corresponding min-heaps and max-heaps.
Since each of the described operations requires only a constant number of binomial heap operations, each of them runs in O(log n) worst-case time, with no amortization needed.
Generally speaking, the data structure you're describing is called a double-ended priority queue. There are a couple of specialized data structures you can use to meet those requirements, though the one described above is probably the easiest to build with off-the-shelf components.
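If you want to play with the idea without writing a binomial heap, here is a rough sketch using Python's heapq and lazy deletion in place of the pointer-based binomial delete described above; the shared entry record plays the role of the cross-pointer. Note the honest weak spot: union here is an O(n + m) rebuild, so a real binomial or pairing heap merge is needed to get the O(log n) union the problem statement asks for.

```python
import heapq

class DoubleEndedPQ:
    """Sketch of a double-ended priority queue built from two heaps that
    share entry records; a flagged-dead record plays the role of the
    cross-pointer, and lazy deletion stands in for binomial-heap delete."""
    def __init__(self):
        self._min = []                     # min-heap of (key, tiebreak, entry)
        self._max = []                     # max-heap of (-key, tiebreak, entry)

    def insert(self, k):
        entry = [k, True]                  # shared record: [key, alive?]
        heapq.heappush(self._min, (k, id(entry), entry))
        heapq.heappush(self._max, (-k, id(entry), entry))

    def extract_min(self):
        while True:
            k, _, entry = heapq.heappop(self._min)
            if entry[1]:                   # skip records removed via the max side
                entry[1] = False
                return k

    def extract_max(self):
        while True:
            nk, _, entry = heapq.heappop(self._max)
            if entry[1]:
                entry[1] = False
                return -nk

    def union(self, other):
        # Weak point of the sketch: an O(n + m) rebuild, not a true O(log n) merge.
        self._min += other._min
        self._max += other._max
        heapq.heapify(self._min)
        heapq.heapify(self._max)
```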

Best algorithm/data structure for a continually updated priority queue

I need to frequently find the minimum value object in a set that's being continually updated. I need to have a priority queue type of functionality. What's the best algorithm or data structure to do this? I was thinking of having a sorted tree/heap, and every time the value of an object is updated, I can remove the object, and re-insert it into the tree/heap. Is there a better way to accomplish this?
A binary heap is hard to beat for simplicity, but it has the disadvantage that decrease-key takes O(n) time. I know, the standard references say that it's O(log n), but first you have to find the item. That's O(n) for a standard binary heap.
By the way, if you do decide to use a binary heap, changing an item's priority doesn't require a remove and re-insert. You can change the item's priority in-place and then either bubble it up or sift it down as required.
If the performance of decrease-key is important, a good alternative is a pairing heap, which is theoretically slower than a Fibonacci heap, but is much easier to implement and in practice is faster than the Fibonacci heap due to lower constant factors. In practice, pairing heap compares favorably with binary heap, and outperforms binary heap if you do a lot of decrease-key operations.
You could also marry a binary heap and a dictionary or hash map, and keep the dictionary updated with the position of the item in the heap. This gives you faster decrease-key at the cost of more memory and increased constant factors for the other operations.
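Here is a minimal sketch of that last approach, with illustrative names: a binary min-heap plus a dict from item to heap slot, so decrease-key finds the item in O(1) and bubbles it up in O(log n). (Pop-min, which needs the symmetric sift-down plus the same map maintenance, is omitted for brevity.)

```python
class IndexedMinHeap:
    """Binary min-heap plus a position map, giving O(log n) decrease-key."""
    def __init__(self):
        self.heap = []   # list of (priority, item) pairs
        self.pos = {}    # item -> index of its pair in self.heap

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i      # keep the map in sync on every move
        self.pos[self.heap[j][1]] = j

    def _bubble_up(self, i):
        while i > 0 and self.heap[i] < self.heap[(i - 1) // 2]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def push(self, priority, item):
        self.heap.append((priority, item))
        self.pos[item] = len(self.heap) - 1
        self._bubble_up(len(self.heap) - 1)

    def decrease_key(self, item, new_priority):
        i = self.pos[item]                 # O(1) lookup instead of an O(n) scan
        assert new_priority <= self.heap[i][0]
        self.heap[i] = (new_priority, item)
        self._bubble_up(i)
```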
Quoting Wikipedia:
To improve performance, priority queues typically use a heap as their backbone, giving O(log n) performance for inserts and removals, and O(n) to build initially. Alternatively, when a self-balancing binary search tree is used, insertion and removal also take O(log n) time, although building trees from existing sequences of elements takes O(n log n) time; this is typical where one might already have access to these data structures, such as with third-party or standard libraries.
If you are looking for a better way, there must be something special about the objects in your priority queue. For example, if the keys are numbers from 1 to 10, a counting-sort-based approach may outperform the usual ones.
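For instance, here is a sketch of that idea for keys known to lie in 1..10, sometimes called a bucket queue (the class name is illustrative): insert is O(1) and extract-min is O(max_key), independent of n.

```python
from collections import deque

class BucketQueue:
    """Priority queue for small integer keys: one FIFO bucket per key."""
    def __init__(self, max_key=10):
        self.buckets = [deque() for _ in range(max_key + 1)]

    def insert(self, key, item):
        self.buckets[key].append(item)     # O(1)

    def extract_min(self):
        for bucket in self.buckets:        # O(max_key), independent of n
            if bucket:
                return bucket.popleft()
        raise IndexError("extract_min from an empty queue")
```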
If your application looks anything like repeatedly choosing the next scheduled event in a discrete event simulation, you might consider the options listed in e.g. http://en.wikipedia.org/wiki/Discrete_event_simulation and http://www.acm-sigsim-mskr.org/Courseware/Fujimoto/Slides/FujimotoSlides-03-FutureEventList.pdf. The latter summarizes results from different implementations in this domain, including many of the options considered in other comments and answers, and a search will find a number of papers in this area. Priority queue overhead really does make a difference in how many times faster than real time you can get your simulation to run, and if you wish to simulate something that takes weeks of real time this can be important.

Can we use binary search tree to simulate heap operation?

I was wondering if we can use a binary search tree to simulate heap operations (insert, find minimum, delete minimum), i.e., use a BST for doing the same job?
Are there any kind of benefits for doing so?
Sure we can, but with a balanced BST.
The minimum is the leftmost element and the maximum is the rightmost element. Finding those elements is O(log n) each, and they can be cached on each insert/delete, after the data structure is modified. [Note there is room for optimization here, but this naive approach also doesn't violate the complexity requirement!]
This way you get insert, delete: O(log n), and findMin/findMax: O(1).
EDIT:
The only advantage I can think of in this implementation is that you get both findMin and findMax in one data structure.
However, this solution will be much slower [more operations per step, and more cache misses are expected...] and consume more space than the regular array-based implementation of a heap.
Yes, but you lose the O(1) average insert of the heap
As others mentioned, you can use a BST to simulate a heap.
However, this has one major downside: you lose the O(1) average insert time, which is basically the only reason to use the heap in the first place: https://stackoverflow.com/a/29548834/895245
If you want to track both min and max on a heap, I recommend that you do it with two heaps instead of a BST to keep the O(1) insert advantage.
Yes, we can, by simply inserting into the BST and finding its minimum. There are few benefits, however, since a lookup takes O(log n) time and other operations pay similar penalties due to the stricter ordering enforced throughout the tree.
Basically, I agree with @amit's answer. I will elaborate more on the implementation of this modified BST.
A heap can do findMin or findMax in O(1), but not both in the same data structure. With a slight modification, the BST can do both findMin and findMax in O(1).
In this modified BST, you keep track of the min node and the max node every time you do an operation that can potentially modify the data structure. For example, in the insert operation you can check whether the newly inserted value is smaller than the current min, and if so point the min reference at the newly added node. The same technique can be applied to the max value. Hence, this BST carries that information, which you can retrieve in O(1), same as a binary heap.
In this BST (specifically a balanced BST), when you pop the min or pop the max, the next min value to assign is the successor of the min node, whereas the next max value is the predecessor of the max node. Thus it performs in O(1). However, as @JimMischel's comment below points out, we still need to re-balance the tree, so it will still run in O(log n), same as a binary heap.
In my opinion, a heap can generally be replaced by a balanced BST, because a balanced BST matches the heap's bounds on almost everything a heap can do. However, I am not sure whether the heap should be considered an obsolete data structure. (What do you think?)
PS: Have to cross reference to different questions: https://stackoverflow.com/a/27074221/764592
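To make the min/max caching concrete, here is a sketch over a plain, unbalanced BST with distinct keys (both simplifications are for brevity; a production version would layer the same bookkeeping on a balanced tree so the structural operations stay O(log n)):

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.left = self.right = None
        self.parent = parent

class MinMaxBST:
    """BST with cached min/max nodes, so findMin/findMax are O(1).
    Sketch only: no rebalancing, and keys are assumed distinct."""
    def __init__(self):
        self.root = self.min_node = self.max_node = None

    def insert(self, key):
        if self.root is None:
            self.root = self.min_node = self.max_node = Node(key)
            return
        cur = self.root
        while True:                        # ordinary BST descent
            side = 'left' if key < cur.key else 'right'
            nxt = getattr(cur, side)
            if nxt is None:
                node = Node(key, cur)
                setattr(cur, side, node)
                break
            cur = nxt
        if key < self.min_node.key:        # maintain the caches on the way out
            self.min_node = node
        if key > self.max_node.key:
            self.max_node = node

    def find_min(self):                    # O(1), same as a binary heap
        return self.min_node.key

    def pop_min(self):
        node = self.min_node               # the leftmost node: no left child
        key = node.key
        if node.right is not None:         # the successor becomes the new min
            succ = node.right
            while succ.left is not None:
                succ = succ.left
            self.min_node = succ
        else:
            self.min_node = node.parent
        if node.parent is None:            # splice the old minimum out
            self.root = node.right
        else:
            node.parent.left = node.right
        if node.right is not None:
            node.right.parent = node.parent
        if self.root is None:
            self.min_node = self.max_node = None
        return key
```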

Algorithm for merging two max heaps?

Is there an efficient algorithm for merging 2 max-heaps that are stored as arrays?
It depends on what the type of the heap is.
If it's a standard binary heap, where every node has up to two children and the levels are filled so that the leaves lie on at most two adjacent rows, you cannot do better than O(n) for the merge.
Just put the two arrays together and create a new heap out of them, which takes O(n).
For better merging performance, you could use another heap variant like a Fibonacci heap, which can merge in O(1) amortized time.
Update:
Note that it is worse to insert all elements of the first heap one by one into the second heap, or vice versa, since each insertion takes O(log n).
As your comment states, you don't seem to know how the heap is optimally built in the beginning (again, for a standard binary heap):
Create an array and put the elements of both heaps into it in some arbitrary order.
Now start at the lowest level. The lowest level contains trivial max-heaps of size 1, so this level is done.
Move a level up. When the heap condition of one of the "sub-heaps" is violated, swap the root of the "sub-heap" with its bigger child. Afterwards, level 2 is done.
Move to level 3. When the heap condition is violated, proceed as before: swap it down with its bigger child and recurse until everything holds up to level 3.
...
When you reach the top, you have created a new heap in O(n).
I omit a proof here, but the intuition is that most of the work happens on the bottom levels, where the "sub-heaps" are small and little content has to be swapped to re-establish the heap condition. This is much better than inserting every element into one of the heaps one at a time: then every insertion operates on the whole heap and takes O(log n), for O(n log n) total.
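The level-by-level procedure above is the classic bottom-up heapify. Here is a minimal array-based sketch of it for a max-heap, with merge then being nothing more than concatenating the two arrays (the function names are illustrative):

```python
def build_max_heap(a):
    """Bottom-up heapify in O(n): fix the small sub-heaps first, as described above."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # last non-leaf node down to the root
        while True:                        # sift a[i] down until both children are smaller
            left, right, biggest = 2 * i + 1, 2 * i + 2, i
            if left < n and a[left] > a[biggest]:
                biggest = left
            if right < n and a[right] > a[biggest]:
                biggest = right
            if biggest == i:
                break
            a[i], a[biggest] = a[biggest], a[i]
            i = biggest

def merge_heaps(h1, h2):
    """O(n) merge of two array-based binary max-heaps."""
    merged = h1 + h2                       # arbitrary order is fine
    build_max_heap(merged)
    return merged
```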
Update 2: A binomial heap allows merging in O(log(n)) and would conform to your O(log(n)^2) requirement.
Two binary heaps of sizes n and k can be merged in O(log n * log k) comparisons. See
Jörg-R. Sack and Thomas Strothotte, An algorithm for merging heaps, Acta Informatica 22 (1985), 172-186.
I think what you're looking for in this case is a Binomial Heap.
A binomial heap is a collection of binomial trees, a member of the merge-able heap family. The worst-case running time for a union (merge) on 2+ binomial heaps with n total items in the heaps is O(lg n).
See http://en.wikipedia.org/wiki/Binomial_heap for more information.
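For completeness, here is a sketch of the binomial-heap union itself, with illustrative names and max-heap order: a heap is a list of binomial trees with strictly increasing degrees, and merging works like binary addition, linking equal-degree trees into a carry. The whole pass performs O(log n) links.

```python
class BinomialTree:
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.children = []

def link(a, b):
    """Combine two trees of equal degree; the larger key becomes the root (max-heap)."""
    if a.key < b.key:
        a, b = b, a
    a.children.append(b)
    a.degree += 1
    return a

def union(h1, h2):
    """Merge two binomial heaps (lists of trees in ascending degree) in O(log n)."""
    # interleave the two root lists by degree, like the merge step of mergesort
    merged, i, j = [], 0, 0
    while i < len(h1) and j < len(h2):
        if h1[i].degree <= h2[j].degree:
            merged.append(h1[i]); i += 1
        else:
            merged.append(h2[j]); j += 1
    merged += h1[i:] + h2[j:]

    # one sweep, linking equal-degree trees -- binary addition with a carry
    result, carry = [], None
    for t in merged:
        if carry is None:
            carry = t
        elif carry.degree == t.degree:
            carry = link(carry, t)         # 1 + 1 = 0, carry the 1 upward
        elif carry.degree < t.degree:
            result.append(carry)           # the carry settles at its position
            carry = t
        else:
            result.append(t)               # t stands alone at a lower position
    if carry is not None:
        result.append(carry)
    return result
```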
