A priority queue supporting LIFO push and pop?

I need to design a "priority queue stack" data structure with the following constraints:
pop() and deleteMin() run in O(log n) average time.
push(x) and getMin() run in O(1) average time.
Does anyone have suggestions about how to design this?

You can implement this by combining a standard stack with a priority queue that supports O(1) insertion and O(log n) amortized deletion. For example, you could pair the stack with a Fibonacci heap or a skew binomial heap, both of which give these guarantees. Make sure to store pointers linking each stack element with its corresponding priority queue element, so that in O(1) time you can jump between the two.
To push an element, push it on the stack and insert it into the priority queue, in O(1) time. To read off the minimum, query the priority queue for the minimum value in O(1) time.
To delete the minimum, call extract-min on the priority queue to remove the minimum value, then follow the pointer into the stack and mark the removed element as invalid; the marking takes O(1) time, so deleteMin is dominated by the O(log n) extract-min. To pop, repeatedly pop the stack until you pop an element that is not marked invalid, then delete that element from the priority queue. This takes O(k + log n) time, where k is the number of invalid elements popped. However, you can show that the O(k) part is amortized O(1) by using the potential method. If you set the potential of the stack to be the number of invalid elements it contains, each deleteMin increases the potential by one, and each pop that discards k invalid elements decreases the potential by k. Therefore, the amortized runtime of a pop is O(log n).
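Here is a rough sketch of this construction in Python (the class and method names are mine). It uses heapq, which only gives O(log n) insertion rather than the O(1) of a Fibonacci heap, and since heapq cannot delete an arbitrary element, pop marks its entry and lets the heap discard it lazily; so this illustrates the bookkeeping rather than the exact bounds above:

    import heapq

    class MinStack:
        # Stack and heap share mutable [value, is_valid] entries.
        def __init__(self):
            self._stack = []   # entries in LIFO order
            self._heap = []    # the same entries, ordered by value

        def push(self, x):                     # O(log n) with heapq
            entry = [x, True]
            self._stack.append(entry)
            heapq.heappush(self._heap, entry)

        def get_min(self):                     # O(1) amortized (non-empty assumed)
            while not self._heap[0][1]:        # discard entries removed via pop()
                heapq.heappop(self._heap)
            return self._heap[0][0]

        def delete_min(self):                  # O(log n) amortized
            self.get_min()                     # clean invalid entries off the top
            entry = heapq.heappop(self._heap)
            entry[1] = False                   # the stack will skip it lazily

        def pop(self):                         # amortized O(1) stack work
            while not self._stack[-1][1]:      # skip entries removed via delete_min()
                self._stack.pop()
            entry = self._stack.pop()
            entry[1] = False                   # the heap will skip it lazily
            return entry[0]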
Hope this helps!

You can use a stack in which each element also stores a pointer to the minimum value at or below it. In that case push(x), pop(), and getMin() all run in O(1).
But after deleteMin(), you would need to adjust the cached minimums of all the items above the removed element.
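For illustration, here is the classic min-stack pattern this answer alludes to, sketched in Python (names are mine); note it supports everything except deleteMin, which is exactly the operation that breaks the O(1) scheme:

    class StackWithMin:
        # Each entry stores (value, minimum of everything at or below it),
        # so push, pop, and get_min are all O(1). deleteMin is not supported.
        def __init__(self):
            self._data = []

        def push(self, x):
            current_min = min(x, self._data[-1][1]) if self._data else x
            self._data.append((x, current_min))

        def pop(self):
            return self._data.pop()[0]

        def get_min(self):
            return self._data[-1][1]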

Related

Efficient HEAPIFY method to reduce number of comparisons

Consider a binary max-heap with n elements. It has a height of O(log n). When a new element is inserted into the heap, it is propagated within the heap so that the max-heap property is always satisfied.
The new element is added as a child on the last level, but after insertion the max-heap property may be violated, so the heapify method is used to restore it. This has a time complexity of O(log n), i.e. the height of the heap.
But can we make it even more efficient?
When many inserts and deletes are performed, this procedure makes things slow. Also, it is a strict requirement that the heap be a valid max-heap after every insertion.
The objective is to reduce the time complexity of the heapify method. This is possible only if the number of comparisons is reduced.
"The objective is to reduce the time complexity of the heapify method."
That is a pity, because that is impossible. In contrast, reducing the time complexity of a sequence of inserts and deletes is possible:
Imagine not inserting into the n-item heap immediately, but building an auxiliary one (or even just a list).
On delete (extract?), place one item from the auxiliary structure (now at size k) in the emptied spot and do a sift-down or sift-up as required; this is cheap when k << n.
If the auxiliary data structure is not significantly smaller than the main one, merge them, as in the sketch below.
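A loose sketch of this buffering idea in Python (names and the merge threshold are illustrative, and this variant lets extract-min scan the small buffer instead of refilling the emptied slot):

    import heapq

    class BufferedHeap:
        def __init__(self, items=(), merge_factor=0.25):
            self._heap = list(items)
            heapq.heapify(self._heap)          # O(n) bottom-up build
            self._aux = []                     # unsorted insert buffer
            self._merge_factor = merge_factor

        def insert(self, x):                   # O(1): no sifting yet
            self._aux.append(x)
            if len(self._aux) > self._merge_factor * max(len(self._heap), 1):
                self._heap.extend(self._aux)   # buffer no longer small: merge
                heapq.heapify(self._heap)
                self._aux.clear()

        def extract_min(self):
            if self._aux:
                m = min(self._aux)             # O(k) scan of the small buffer
                if not self._heap or m < self._heap[0]:
                    self._aux.remove(m)
                    return m
            return heapq.heappop(self._heap)   # O(log n)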
Such ponderings lead to advanced heaps like Fibonacci, pairing, Brodal…
The time complexity of the insert operation in a heap depends on the number of comparisons that are made. One could imagine using some overhead to implement a smart binary search along the leaf-to-root path.
However, the time complexity is not determined by the number of comparisons alone. It is determined by all the work that must be performed, and in this case the number of writes is also O(log n), and that number of writes cannot be reduced.
The number of nodes whose value must change during an insert is O(log n) in the worst case, so reducing the number of comparisons alone is not enough to reduce the complexity.
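To make the distinction concrete, here is a sketch (function name is mine) of insertion into a 0-indexed, list-based min-heap that binary-searches the ancestor path: the comparisons drop to O(log log n), but the O(log n) writes that shift the path remain:

    def heap_insert_few_comparisons(heap, x):
        # Ancestors of the new slot, read from parent up to the root,
        # form a non-increasing sequence in a min-heap, so binary search
        # can count how many of them exceed x.
        heap.append(x)
        path = []
        i = len(heap) - 1
        while i > 0:
            i = (i - 1) // 2
            path.append(i)             # parent, grandparent, ..., root
        lo, hi = 0, len(path)
        while lo < hi:                 # O(log log n) comparisons
            mid = (lo + hi) // 2
            if heap[path[mid]] > x:
                lo = mid + 1
            else:
                hi = mid
        child = len(heap) - 1
        for j in path[:lo]:            # still O(log n) writes: shift the
            heap[child] = heap[j]      # lo largest ancestors down one level
            child = j
        heap[child] = x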

What operations would you use for implementing a priority queue PQ with enqueue and dequeue?

Assume that you are implementing a priority queue PQ that returns the max element on dequeue operation.
If we use a max heap to implement the PQ, enqueue is O(______) operation, and dequeue is O(_____) operation
Could someone please answer/explain how you got it? I am thinking O(log n) for both, but I'm not sure.
Think of how a binary heap works.
When you insert an item, you add it as the last node of the heap and then sift it up into its proper place. Since a heap that contains n items has a height of O(log n), and you might have to sift the item all the way to the top, the worst case is O(log n).
When you remove an item, you replace the root node with the last node in the heap, and then you sift it down. In the worst case you'll have to sift it all the way back down to the bottom of the heap: a move of log(n) levels. Therefore, O(log n).
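For illustration, a compact sketch of those two operations on a 0-indexed, array-based max-heap (class name is mine):

    class MaxHeap:
        def __init__(self):
            self._a = []

        def enqueue(self, x):          # O(log n): sift the new leaf up
            a = self._a
            a.append(x)
            i = len(a) - 1
            while i > 0 and a[(i - 1) // 2] < a[i]:
                a[(i - 1) // 2], a[i] = a[i], a[(i - 1) // 2]
                i = (i - 1) // 2

        def dequeue(self):             # O(log n): move last to root, sift down
            a = self._a
            top = a[0]
            last = a.pop()
            if a:
                a[0] = last
                i = 0
                while True:
                    left, right, largest = 2 * i + 1, 2 * i + 2, i
                    if left < len(a) and a[left] > a[largest]:
                        largest = left
                    if right < len(a) and a[right] > a[largest]:
                        largest = right
                    if largest == i:
                        break
                    a[i], a[largest] = a[largest], a[i]
                    i = largest
            return top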

Advantages of a Binary Heap for a Priority Queue?

It seems I'm missing something very simple: what are the advantages of a binary heap for a priority queue compared with, say, a quick-sorted array of values? In both cases we keep values in an array, insert is O(log N), and delete-max is O(1) in both cases. Initial construction out of a given array of elements is O(N log N) in both cases, though the link http://en.wikipedia.org/wiki/Heap_%28data_structure%29 suggests the faster Floyd's algorithm for binary heap construction. But in the case of a queue the elements are probably received one by one, so this advantage disappears. Also, merge seems to perform better for a binary heap.
So what are the reasons to prefer a binary heap besides merge? Maybe my assumption is wrong, and the binary heap is used only for studying purposes. I checked the C++ docs; they mention "a heap", but of course that does not necessarily mean a binary heap.
Somewhat similar question: When is it a bad idea to use a heap for a Priority Queue?
The major advantage of the binary heap is that you can add new values to it efficiently after initially constructing it. Suppose you want to back a priority queue with a sorted array. If all the values in the queue are known in advance, you can just sort the values, as you've mentioned. But what happens when you then want to add a new value to the priority queue? This might take Θ(n) time in the worst case, because you'd have to shift over all the array elements to make space for the element you just added. On the other hand, insertion into a binary heap takes O(log n) time, which is exponentially faster.
Another reason you'd use a heap over a sorted array is if you only need to dequeue a few elements. As you mentioned, sorting an array takes O(n log n) time, but using clever algorithms you can build a heap in O(n) time. If you need to build a priority queue and dequeue k elements from it, where k is unknown in advance, the runtime with a sorted array is O(n log n + k) and with a binary heap is O(n + k log n). For small k, the second approach is much faster.
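A quick illustration of that trade-off with Python's heapq (the sizes are arbitrary):

    import heapq, random

    values = [random.random() for _ in range(100_000)]
    k = 10

    # Sorted-array approach: O(n log n) up front.
    smallest_by_sort = sorted(values)[:k]

    # Heap approach: O(n) bottom-up build, then k pops at O(log n) each.
    heap = list(values)
    heapq.heapify(heap)                # Floyd's algorithm
    smallest_by_heap = [heapq.heappop(heap) for _ in range(k)]

    assert smallest_by_heap == smallest_by_sort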

What is the best way to implement a double-ended priority queue?

I would like to implement a double-ended priority queue with the following constraints:
needs to be implemented in a fixed-size array, say 100 elements; if new elements need to be added after the array is full, the oldest needs to be removed
need maximum and minimum in O(1)
if possible insert in O(1)
if possible remove minimum in O(1)
clear to empty/init state in O(1) if possible
count of number of elements in array at the moment in O(1)
I would like O(1) for all 5 of the above operations, but it's not possible to have O(1) on all of them in the same implementation. At least O(1) on 3 operations and O(log n) on the other 2 should suffice.
I would appreciate any pointers to such an implementation.
There are many specialized data structures for this. One simple data structure is the min-max heap, which is implemented as a binary heap where the layers alternate between "min layers" (each node is less than or equal to its descendants) and "max layers" (each node is greater than or equal to its descendants). The minimum and maximum can be found in O(1) time, and, as in a standard binary heap, enqueues and dequeues can be done in O(log n) time each.
You can also use the interval heap data structure, which is another specialized priority queue for the task.
Alternatively, you can use two priority queues - one storing elements in ascending order and one in descending order. Whenever you insert a value, you insert it into both priority queues and have each copy store a pointer to the other. Then, whenever you dequeue the min or max, you remove the corresponding element from the other heap as well (sketched below).
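Here is a rough Python sketch of that pairing (names are mine). Since heapq cannot delete an arbitrary element directly, shared ids with lazy deletion stand in for the cross-pointers:

    import heapq

    class DoubleEndedPQ:
        def __init__(self):
            self._min = []         # (value, id) min-heap
            self._max = []         # (-value, id) max-heap
            self._dead = set()     # ids already removed via the other side
            self._next = 0

        def insert(self, x):       # O(log n)
            heapq.heappush(self._min, (x, self._next))
            heapq.heappush(self._max, (-x, self._next))
            self._next += 1

        def _prune(self, heap):    # drop entries removed from the other side
            while heap and heap[0][1] in self._dead:
                self._dead.discard(heapq.heappop(heap)[1])

        def pop_min(self):         # O(log n) amortized
            self._prune(self._min)
            x, i = heapq.heappop(self._min)
            self._dead.add(i)
            return x

        def pop_max(self):         # O(log n) amortized
            self._prune(self._max)
            nx, i = heapq.heappop(self._max)
            self._dead.add(i)
            return -nx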
As yet another option, you could use a balanced binary search tree to store the elements. The minimum and maximum can then be found in time O(log n) (or O(1) if you cache the results) and insertions and deletions can be done in time O(log n). If you're using C++, you can just use std::map for this and then use begin() and rbegin() to get the minimum and maximum values, respectively.
Hope this helps!
A binary heap will give you insert and remove minimum in O(log n) and the others in O(1).
The only tricky part is removing the oldest element once the array is full. For this, keep another array:
    time[i] = the position in the heap array of the element added at time i + 100 * k
Every 100 iterations, you increment k.
Then, when the array fills up for the first time, you remove heap[ time[0] ], when it fills up for the second time you remove heap[ time[1] ], ..., when it fills up for the 100th time, you wrap around and remove heap[ time[0] ] again etc. When it fills up for the kth time, you remove heap[ time[k % 100] ] (100 is your array size).
Make sure to also update the time array when you insert and remove elements.
Removal of an arbitrary element can be done in O(log n) if you know its position: just swap it with the last element in your heap array, and sift the element you have swapped in down (or up, if it is smaller than its new parent), as in the sketch below.
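A self-contained sketch of that removal for a 0-indexed, list-based min-heap (function name is mine):

    def remove_at(heap, pos):
        # Replace heap[pos] with the last element, then restore the heap
        # invariant by sifting that element down or up as required.
        removed = heap[pos]
        last = heap.pop()
        if pos == len(heap):               # we removed the last element itself
            return removed
        heap[pos] = last
        i, n = pos, len(heap)
        while True:                        # sift down while a child is smaller
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and heap[c] < heap[smallest]:
                    smallest = c
            if smallest == i:
                break
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest
        while i > 0 and heap[i] < heap[(i - 1) // 2]:
            heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
            i = (i - 1) // 2               # sift up if smaller than the parent
        return removed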
If you absolutely need max and min to be O(1), then what you can do is create a linked list where you constantly keep track of min, max, and size, and link all the nodes into some sort of tree structure, probably a heap. Min, max, and size would then all be O(1), and since finding any node takes O(log n), insert and remove are O(log n) each. Clearing would be trivial.
If your queue is a fixed size, then O-notation is meaningless. Any O(log n) or even O(n) operation is essentially O(1) because n is fixed, so what you really want is an algorithm that's fast for the given dataset. Probably two parallel traditional heap priority queues would be fine (one for high, one for low).
If you know more about what kind of data you have, you might be able to make something more special-purpose.

Priority queue O(1) insertion and removal

Is it possible for a priority queue to have both O(1) insertion and removal?
Priority queues can be implemented using heaps, and looking at the running times for Fibonacci heaps it appears that it is not possible to get a running time better than O(log N) per removal.
I am trying to implement a data structure where given N items I will have half in a max-priority queue and half in a min-priority queue. I am then to remove all N items sequentially.
I can insert all N elements in O(N) time, but removing all N items will take O(N log N), so I am wondering if another approach would be more suitable.
If you could construct a priority queue with O(1) insertion and O(1) removal, you could use it to sort a list of n items in O(n) time. As explained in this answer, you can't sort in O(n) in the general case, so it is impossible to construct a priority queue with O(1) insertion and O(1) removal without making more assumptions about the input.
For example, a priority queue that has O(1) insertion and O(k) (k is the maximum element that could be inserted) removal can be constructed. Keep a table of k linked lists. Insertion of x just prepends an item to the front of the xth list. Removal has to scan through the table to find the first non-empty list (then remove the first item of the list and return the index of that list). There are only k lists, so removal takes O(k) time. If k is a constant, that works out to O(1) removal.
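A minimal sketch of that bucket-queue idea in Python (class name is mine; priorities are integers in range(k)):

    class BucketQueue:
        def __init__(self, k):
            self._buckets = [[] for _ in range(k)]

        def insert(self, priority, item):              # O(1)
            self._buckets[priority].append(item)

        def remove_min(self):                          # O(k): scan for the
            for p, bucket in enumerate(self._buckets): # first non-empty list
                if bucket:
                    return p, bucket.pop()
            raise IndexError("remove from an empty BucketQueue")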
In practice, using a table of counts would work out better. Incrementing a variable-length integer isn't constant time unless you use amortized analysis (which is why I didn't use it in the previous paragraph), but in practice you wouldn't need variable-length counts anyway. Also, in practice it would be bad for large k, even if k is a constant - you'd run out of memory quickly and scanning for the first non-zero element could take a while.
