Unsorted Priority Queue - data-structures

I've been misled a bit and am now somewhat confused. This is what I understand an unsorted priority queue to be. Can someone please confirm?
An unsorted priority queue is one in which we insert at the end and remove elements based on priority (i.e., the smallest value in the queue).
Thank you.

The basic queue data structure works on a first-come, first-served basis (FIFO: first in, first out).
The first element inserted into the queue is the first one to be served, and the last element inserted is served last. This is how an unsorted queue behaves.
A sorted queue sorts the inserted elements and then serves them in the order defined by your sorting method.
If you attach some kind of sorting algorithm to the queue, it will serve elements in the order that algorithm produces.
For more info:
http://en.wikipedia.org/wiki/Priority_queue
http://en.wikipedia.org/wiki/Sorting_algorithm

The priority queue is an abstract data structure that defines the methods get-min, push and pop-min, and possibly also union. Whether the concrete implementation uses a sorted container or not should not affect the operations we are able to perform.
There are several possible implementations. The most popular one uses a binary heap (which, in a way, is not sorted), but one may also use a sorted list, for example. I think that wherever you heard about an unsorted priority queue, the person may have meant a priority queue that is not implemented using a sorted list or other sorted container.

There is a conceptual difference between a Queue and a Priority Queue. A Queue is a first-in, first-out data structure that allows efficient access to both ends of the queue (head and tail). A Priority Queue is an abstract data structure that provides a getBestItem() function, without specifying how (hence abstract).
An Unsorted Priority Queue could refer to a PQ implementation that does no bookkeeping between operations (no organization of elements) and implements getBestItem() as a simple linear search. This makes getBestItem() very inefficient (O(n)), but insert/delete very cheap (O(1)). If insert/delete is frequent and getBestItem() is not, this could be a valid choice.
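To make the trade-off concrete, here is a minimal Python sketch of such an unsorted priority queue. The class and method names (`UnsortedPriorityQueue`, `get_best_item`, `remove_best_item`) are made up for illustration, and "best" is assumed to mean the smallest value, as in the question.

```python
class UnsortedPriorityQueue:
    """Priority queue backed by a plain list; no ordering is maintained."""

    def __init__(self):
        self._items = []

    def insert(self, item):
        # O(1) amortized: append at the end, no reorganization.
        self._items.append(item)

    def get_best_item(self):
        # O(n): linear scan for the smallest value.
        if not self._items:
            raise IndexError("priority queue is empty")
        return min(self._items)

    def remove_best_item(self):
        # O(n) to locate the minimum, then O(1) to remove it by
        # swapping it with the last element and popping.
        if not self._items:
            raise IndexError("priority queue is empty")
        items = self._items
        best = min(range(len(items)), key=items.__getitem__)
        items[best], items[-1] = items[-1], items[best]
        return items.pop()

    def __len__(self):
        return len(self._items)
```

Inserting is just an append, and all of the cost is paid when the best item is requested.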

Related

Most efficient way to implement stack and queue together?

What would be the most appropriate way to implement a stack and a queue together efficiently, in a single data structure? The number of elements is unbounded. Retrieval and insertion should both happen in constant time.
A doubly linked list has all the computational complexity attributes you desire, but poor cache locality.
A ring buffer (array) that allows appending and removing at both head and tail has the same complexity characteristics, in the amortized sense. It uses a dynamic array and requires reallocation once the number of elements grows beyond its capacity.
However, just as an array list / vector is generally faster in practice than a linked list for sequential access, the ring buffer will in most cases be faster and more memory efficient than a doubly linked list implementation.
It is one of the possible implementations of the deque (double-ended queue) abstract data structure; see e.g. the ArrayDeque<E> implementation in Java.
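As a rough illustration of the ring buffer idea, here is a small Python sketch. It is not how Java's ArrayDeque is written; the class name `RingDeque`, the method names and the doubling policy are assumptions made for this example, and the initial capacity is assumed to be at least 1.

```python
class RingDeque:
    """Double-ended queue on a circular array that doubles when full."""

    def __init__(self, capacity=4):
        self._buf = [None] * capacity
        self._head = 0   # index of the first (front) element
        self._size = 0

    def _grow(self):
        # Reallocate: copy elements in logical order into an array twice as big.
        old, n = self._buf, len(self._buf)
        self._buf = [old[(self._head + i) % n] for i in range(self._size)] + [None] * n
        self._head = 0

    def push_back(self, item):
        if self._size == len(self._buf):
            self._grow()
        self._buf[(self._head + self._size) % len(self._buf)] = item
        self._size += 1

    def push_front(self, item):
        if self._size == len(self._buf):
            self._grow()
        self._head = (self._head - 1) % len(self._buf)
        self._buf[self._head] = item
        self._size += 1

    def pop_back(self):
        if self._size == 0:
            raise IndexError("pop from empty deque")
        idx = (self._head + self._size - 1) % len(self._buf)
        item, self._buf[idx] = self._buf[idx], None
        self._size -= 1
        return item

    def pop_front(self):
        if self._size == 0:
            raise IndexError("pop from empty deque")
        item, self._buf[self._head] = self._buf[self._head], None
        self._head = (self._head + 1) % len(self._buf)
        self._size -= 1
        return item
```

Using push_back with pop_back gives a stack, and push_back with pop_front gives a queue. Every operation is O(1) except for the occasional reallocation, which makes the pushes amortized O(1).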
A doubly linked list can solve this problem with all operations taking constant time:
It allows push() or enqueue() by appending the element to the list in constant time.
It allows pop() by removing the last element in constant time.
It allows dequeue() by removing the first element, also in constant time.
A two-way linked list is going to be best for this. Each node in the list has two references: one to the item before it and one to the item after it. The main list object maintains a reference to the item at the front of the list and one at the back of the list.
Any time it inserts an item, the list:
creates a new node, giving it a reference to the previous first or last node in the list (depending on whether you're adding to the front or back).
connects the previous first or last node to point at the newly-created node.
updates its own reference to the first or last node, to point at the new node.
Removing an item from the front or back of the list effectively reverses this process.
Inserting to the front or back of the structure will always be an O(1) operation.
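The steps above can be sketched in Python roughly as follows. The names `_Node` and `LinkedDeque` are invented for this example; the push_back/pop_back/pop_front methods correspond to the push()/enqueue(), pop() and dequeue() operations mentioned earlier.

```python
class _Node:
    __slots__ = ("value", "prev", "next")

    def __init__(self, value):
        self.value = value
        self.prev = None   # item before this one
        self.next = None   # item after this one


class LinkedDeque:
    """Doubly linked list that keeps references to both ends."""

    def __init__(self):
        self._front = None
        self._back = None

    def push_back(self, value):      # push() / enqueue()
        node = _Node(value)
        node.prev = self._back       # new node points at the old last node
        if self._back is None:
            self._front = node       # list was empty
        else:
            self._back.next = node   # old last node points at the new node
        self._back = node            # list now ends at the new node

    def push_front(self, value):
        node = _Node(value)
        node.next = self._front
        if self._front is None:
            self._back = node
        else:
            self._front.prev = node
        self._front = node

    def pop_back(self):              # pop()
        node = self._back
        if node is None:
            raise IndexError("pop from empty deque")
        self._back = node.prev
        if self._back is None:
            self._front = None
        else:
            self._back.next = None
        return node.value

    def pop_front(self):             # dequeue()
        node = self._front
        if node is None:
            raise IndexError("pop from empty deque")
        self._front = node.next
        if self._front is None:
            self._back = None
        else:
            self._front.prev = None
        return node.value
```

No operation loops over the list, so each one is O(1) regardless of how many elements are stored.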

How to update key of a relaxed vertex in Dijkstra's algorithm?

Just like it was asked here,
I fail to understand how we can find the index of a relaxed vertex in the heap.
Programming style-wise, the heap is a black box that abstracts away the details of a priority queue. Now if we need to maintain a hash table that maps vertex keys to corresponding indices in the heap array, that would need to be done in heap implementation, right?
But most standard heaps don't provide a hash table that does such mapping.
Another way to deal with this whole problem is to add the relaxed vertices to the heap regardless of anything. When we extract the minimum we'll get the best one. To prevent the same vertex being extracted multiple times, we can mark it visited.
So my exact question is, what is the typical way (in the industry) of dealing with this problem?
What are the pros and cons compared with the methods I mentioned?
Typically, you'd need a specially-constructed priority queue that supports the decreaseKey operation in order to get this to work. I've seen this implemented by having the priority queue explicitly keep track of a hash table of the indices (if using a binary heap), or by having an intrusive priority queue where elements stored are nodes in the heap (if using a binomial heap or Fibonacci heap, for example). Sometimes, the priority queue's insertion operation will return a pointer to the node in the priority queue that holds the newly-added key. As an example, here is an implementation of a Fibonacci heap that supports decreaseKey. It works by having each insert operation return a pointer to the node in the Fibonacci heap, which makes it possible to look up the node in O(1), assuming you keep track of the returned pointers.
Hope this helps!
You are asking some very valid questions but unfortunately they are kind of vague so we won't be able to give you a 100% solid "industry standard" answer. However, I'll try to go over your points anyway:
Programming style-wise, the heap is a black box that abstracts away the details of a priority queue
Technically, a priority queue is the abstract interface (insert elements with a priority, extract the lowest-priority element) and a heap is a concrete implementation (array-based heap, binomial heap, Fibonacci heap, etc.).
What I'm trying to say is that using an array is only one particular way to implement a priority queue.
Now if we need to maintain a hash table that maps vertex keys to corresponding indices in the heap array, that would need to be done in heap implementation, right?
Yes, because every time you move an element inside the array you will need to update its index in the hash table.
But most standard heaps don't provide a hash table that does such mapping.
Yes. This can be very annoying.
Another way to deal with this whole problem is to add the relaxed vertices to the heap regardless of anything.
I guess that could work, but I don't think I ever saw anyone do that. The whole point of using a heap here is to increase performance, and by adding redundant elements to the heap you kind of go against that. Sure, you preserve the "black-boxness" of the priority queue, but I don't know if that is worth it. Additionally, there is a chance that the extra pop_heap operations could negatively affect your asymptotic complexity, but I'd have to do the math to check.
what is the typical way (in the industry) of dealing with this problem?
First of all, ask yourself if you can get away with using a dumb array instead of a priority queue.
Sure, finding the minimum element is now O(n) instead of O(log n), but the implementation is the simplest possible (an advantage on its own). Additionally, using an array will be just as efficient if your graph is dense, and even if your graph is sparse it might be efficient enough, depending on how big your graph is.
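To show what the "dumb array" version might look like, here is a minimal Python sketch. The function name `dijkstra_array` and the graph format (a dict mapping each vertex to a list of (neighbor, weight) pairs, with every vertex present as a key) are assumptions made for this example.

```python
import math


def dijkstra_array(graph, source):
    """Dijkstra with a linear scan for the minimum instead of a heap."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    unvisited = set(graph)
    while unvisited:
        # O(n) scan replaces the heap's extract-min.
        u = min(unvisited, key=dist.__getitem__)
        if dist[u] == math.inf:
            break  # everything left is unreachable
        unvisited.remove(u)
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                # "decreaseKey" is just an assignment here.
                dist[v] = dist[u] + w
    return dist
```

The whole index-tracking problem disappears because relaxing a vertex is a plain write into the table, at the price of an O(n) scan per extracted vertex.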
If you really need a priority queue, then you are going to have to find one that has a decreaseKey operation implemented. If you can't find one, I would say it's not that bad to implement it yourself. It might be less trouble than trying to find an existing implementation and then trying to fit it in with the rest of your code.
Finally, I would not recommend using the really fancy heap data structures (such as Fibonacci heaps). While these often show up in textbooks as a way to get optimal asymptotics, in practice they have terrible constant factors, and these constant factors are significant when compared with something that is logarithmic.
Programming style-wise, the heap is a black box that abstracts away the details of a priority queue.
Not necessarily. Both C++ and Python have heap libraries that provide functions on arrays rather than black box objects. Go abstracts a bit, but requires the programmer to provide an array-like data structure for its heap operations to work on.
All this abstraction leaking in standardized, industry-strength libraries has a reason: some algorithms (Dijkstra) require a heap with additional operations, which would degrade the performance of other algorithms. Yet other algorithms (heapsort) need heap operations that work in-place on input arrays. If your library's heap gives you a black-box object, and it doesn't suffice for some algorithm, then it's time to re-implement the operations as functions on arrays, or find a library that does have the operations you need.
This is a great question, and one of those details that algorithms books like CLRS just gloss over without mention.
There are a few ways to handle this:
1. Use a custom heap implementation that supports decreaseKey operations.
2. Every time you "relax" a vertex, just add it back into the heap with the new lower weight, then write a custom way to ignore the old elements later. You can take advantage of the fact that you only ever add a node into the heap/priority queue if its weight has decreased.
Option #1 is definitely used. For example, if you are familiar with the Open Source Routing Machine (OSRM), it searches over graphs with many millions of nodes to compute road routing directions. It uses a Boost implementation of a d-ary heap specifically because it has better decreaseKey operations (source). Often the Fibonacci heap is also mentioned for this purpose because it supports amortized O(1) decrease-key operations, but likewise you'd probably have to roll your own.
In option #2 you end up doing more insertions and removeMin operations in total. If D is the total number of "relax" operations you must do, you end up doing a total of D additional heap operations. So while this has a theoretically worse runtime complexity, in practice there is research evidence that option #2 can be more performant because you can take advantage of cache locality and avoid the additional overhead of keeping pointers to do the decreaseKey operations (see [1], specifically pg. 16). This approach also has the advantage of being simpler and allows you to use standard library heap/priority-queue implementations in most languages.
To give you some pseudocode for how option #2 would look:
// Imagine this is some lookup table that has the minimum weight
// so far for each node.
weights = {}
while Queue is not empty:
    u = Queue.removeMin()
    // This is our new logic to discard the duplicate entries.
    if u.weight > weights[u]:
        continue
    visit neighbors[u] and relax() each one
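For reference, a runnable Python version of this idea could look like the sketch below, using the standard library's heapq. The function name `dijkstra_lazy` and the graph format (a dict mapping a vertex to a list of (neighbor, weight) pairs) are assumptions for this example. Vertices are also assumed to be mutually comparable (e.g. all strings), since ties on distance fall back to comparing the vertices themselves.

```python
import heapq
import math


def dijkstra_lazy(graph, source):
    """Dijkstra without decreaseKey: push duplicates, skip stale ones."""
    dist = {source: 0}       # best known distance per vertex
    heap = [(0, source)]     # (distance, vertex) entries, may hold duplicates
    while heap:
        d, u = heapq.heappop(heap)
        # Discard entries that were superseded by a later, better push.
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))  # old entry for v stays in the heap
    return dist
```

The returned dict only contains vertices reachable from the source. The heap can grow to hold one entry per relaxed edge, which is the extra cost of option #2.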
As an alternative, you can also check out the Python standard library heapq docs, which describe another approach to keeping track of "dead" entries in the heap. Whether you find it helpful depends on what data structure you are using for your graph representation and for storing vertex distances.
[1] Priority Queues and Dijkstra’s Algorithm 2007

What is the ideal data structure (Priority Queue seems insufficient) for frontier in Uniform-cost search?

Very often we need to discard repeated states, as stated in uniform-cost search.
if n is in frontier with higher cost
    replace existing node with n
A priority queue doesn't provide an interface for searching for an item, looking up its priority, and then updating it. I am surprised I cannot find any resource on this. Can anyone offer help, please?
You are looking for a Priority Search Queue.
A priority search queue efficiently supports the operations of both a search tree and a priority queue. A Binding is a product of a key and a priority. Bindings can be inserted, deleted, modified and queried in the queue (usually in logarithmic time), and the binding with the least priority can be retrieved in constant time.
Here is an implementation in Haskell.
Many Priority Queue implementations allow keeping some reference to queue element and then use it to delete/update this element.
You can easily keep such references if you implement Priority Queue as a binary search tree. For Binary Heap this is possible, but more difficult: you'll need to update references for all elements, moved upheap or downheap.
There are Priority Queue implementations that allow efficient updates of elements when used with algorithms like uniform-cost search. See Pairing heap and Fibonacci heap.
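If you only have a plain binary heap available, one way to keep a reference to each queue element is the pattern described in the Python heapq documentation: store a dict from state to its heap entry and invalidate the old entry in place instead of moving it. The sketch below is an illustration of that pattern only; the class name `Frontier` and its methods are invented here, and states are assumed to be hashable.

```python
import heapq
import itertools


class Frontier:
    """Frontier for uniform-cost search with 'replace if cheaper' support."""

    _REMOVED = object()  # placeholder marking an invalidated entry

    def __init__(self):
        self._heap = []
        self._entries = {}                 # state -> live [cost, count, state]
        self._counter = itertools.count()  # tie-breaker, so states are never compared

    def add_or_improve(self, state, cost):
        """Insert state, or replace its entry if cost is lower. Returns True on change."""
        entry = self._entries.get(state)
        if entry is not None:
            if entry[0] <= cost:
                return False               # existing entry is at least as good
            entry[2] = Frontier._REMOVED   # invalidate the old entry in place
        entry = [cost, next(self._counter), state]
        self._entries[state] = entry
        heapq.heappush(self._heap, entry)
        return True

    def pop(self):
        """Remove and return the (state, cost) pair with the lowest cost."""
        while self._heap:
            cost, _, state = heapq.heappop(self._heap)
            if state is not Frontier._REMOVED:
                del self._entries[state]
                return state, cost
        raise KeyError("pop from an empty frontier")

    def __len__(self):
        return len(self._entries)
```

Invalidated entries stay in the heap until they reach the top, so the heap can temporarily hold more entries than there are live states.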
Actually, you can get away with a regular priority queue for uniform-cost search.
You can insert a new, better (node, cost) pair without deleting the old one. You will always process the newly inserted entry first (because it's better) and processing the older entry will effectively be a no-op. The downside is that you may end up with O(E) elements in the priority queue (instead of O(V)).

Data Structure Creation (PQ linked list merge?)

So I need to find a data structure for this situation that I'll describe:
This is not my actual problem, but it explains the data structure aspect I need more succinctly:
I have an army made up of platoons. Every platoon has a number of men and a rank number (higher being better). If an enemy were to attack my army, they would spend some amount of POWER killing my soldiers, starting from the weakest platoon and working up, where it takes (platoon rank) amount of power to kill every soldier from a platoon.
I could easily simulate enemies attacking me by peeking and popping elements from my priority queue of platoons, ordered by rank number, but that is not what I need to do. What I need is to allow enemies to view all the soldiers they would kill if they attacked me, without actually attacking, and so without actually deleting elements from my priority queue (if I implemented it as a PQ).
Sidenote: the iterator returned by Java's PriorityQueue.iterator() traverses elements in no particular order. I know an iterator is all I need, just FYI.
The problem is that if I implemented this as a PQ, I could only see the top element, so I would have to pop platoons off as if they were dying and then push them back on once the hypothetical attack has been calculated. I could also implement this as a linked list or array, but insertion takes too long. Ultimately I would love to use a priority queue; I just need the ability to view the (pick an index)'th element of the PQ, or to have every object in the PQ hold a pointer to the next object in the PQ, like a linked list.
Is this idea of maintaining linked-list-style pointers within a PQ possible with Java's PriorityQueue? Is it implemented for me somewhere in PriorityQueue that I don't know about? Is the index lookup implemented? Is there another data structure that would better serve my purpose? Is it realistic for me to take the source code of Java's PriorityQueue and rewrite it on my machine to maintain these pointers like a linked list?
Any ideas are very welcome, not really sure which path I want to take on this one.
One thing you could do is an augmented binary search tree. That would allow efficient access to the nth smallest element while still keeping the elements ordered. You could also use a threaded binary search tree. That would allow you to step from one element to the next larger one in constant time, which is faster than in a normal binary tree. Both of these data structures are slower than a heap, though.

position index for binary heap priority queues?

So let's say I have a priority queue of N items with priorities, where N is in the thousands, using a priority queue implemented with a binary heap. I understand the EXTRACT-MIN and INSERT primitives (see Cormen, Leiserson, Rivest which uses -MAX rather than -MIN).
But DELETE and DECREASE-KEY both seem to require the priority queue to be able to find an item's index in the heap given the item itself (alternatively, that index needs to be given by consumers of the priority queue, but this seems like an abstraction violation).... which looks like an oversight. Is there a way to do this efficiently without having to add a hashtable on top of the heap?
Right, I think the point here is that for the implementation of the priority queue you may use a binary heap whose API takes an index (i) for its HEAP-INCREASE-KEY(A, i, key), but the interface to the priority queue may be allowed to take an arbitrary key. You're free to have the priority queue encapsulate the details of key->index maps. If you need your PQ-INCREASE-KEY(A, old, new) to work in O(log n), then you'd better have an O(log n) or better key-to-index lookup that you keep up to date. That could be a hash table or another fast lookup structure.
So, to answer your question: I think it's inevitable that the data structure be augmented somehow.
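As a sketch of what such an augmented structure might look like, here is a min-ordered binary heap in Python that keeps an item-to-index dict up to date on every swap. The class name `IndexedMinHeap` and its method names are made up for this example, and items are assumed to be hashable and unique.

```python
class IndexedMinHeap:
    """Binary min-heap with a key->index map for decrease_key and delete."""

    def __init__(self):
        self._heap = []   # list of [priority, item]
        self._pos = {}    # item -> current index in self._heap

    def __len__(self):
        return len(self._heap)

    def _swap(self, i, j):
        h = self._heap
        h[i], h[j] = h[j], h[i]
        self._pos[h[i][1]] = i   # keep the lookup table in sync
        self._pos[h[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self._heap[i][0] >= self._heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self._heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self._heap[child][0] < self._heap[smallest][0]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def insert(self, item, priority):
        if item in self._pos:
            raise KeyError(f"{item!r} is already in the heap")
        self._heap.append([priority, item])
        self._pos[item] = len(self._heap) - 1
        self._sift_up(len(self._heap) - 1)

    def decrease_key(self, item, priority):
        i = self._pos[item]              # O(1) expected lookup
        if priority > self._heap[i][0]:
            raise ValueError("new key is larger than current key")
        self._heap[i][0] = priority
        self._sift_up(i)                 # O(log n)

    def delete(self, item):
        i = self._pos[item]
        last = len(self._heap) - 1
        self._swap(i, last)              # move the victim to the end
        self._heap.pop()
        del self._pos[item]
        if i < last:                     # restore heap order at position i
            self._sift_down(i)
            self._sift_up(i)

    def extract_min(self):
        priority, item = self._heap[0]
        self.delete(item)
        return item, priority
```

Callers only ever pass items, never indices, so the index bookkeeping stays hidden behind the priority queue interface.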
FWIW, and if someone still comes looking for something similar — I recently chanced upon an implementation for an Indexed priority queue while doing one of the Coursera courses on Algorithms.
The basic gist is to incorporate a reverse lookup using 2 arrays to support the operations that the OP stated.
Here's a clear implementation for Min Ordered Indexed Priority Queue.
"But DELETE and DECREASE-KEY both seem to require the priority queue to be able to find an item's index in the heap given the item itself" -- it's clear from the code that at least a few of these methods use an index into the heap rather than the item's priority. Clearly, i is an index in HEAP-INCREASE-KEY:
HEAP-INCREASE-KEY(A, i, key)
    if key < A[i]
        then error "new key is smaller than current key"
    A[i] <-- key
    ...
So if that's the API, use it.
I modified my node class to add a heapIndex member. This is maintained by the heap as nodes are swapped during insert, delete, decrease, etc.
This breaks encapsulation (my nodes are now tied to the heap), but it runs fast, which was more important in my situation.
One way is to split up the heap into the elements on one side and the organization on the other.
For full functionality, you need two relations:
a) Given a Heap Location (e.g. Root), find the Element seated there.
b) Given an Element, find its Heap Location.
The second is very easy: add a value "location" (most likely an index in an array-based heap) that is updated every time the element is moved in the heap.
The first is also simple: instead of storing Elements, you simply keep a heap of pointers to Elements (or array indices). Now, given a Location (e.g. Root), you can find the Element seated there by dereferencing it (or accessing the vector).
But DELETE and DECREASE-KEY both seem to require the priority queue to be able to find an item's index in the heap given the item itself
Actually, that's not true. You can implement these operations in an unindexed graph, linked-lists and 'traditional' search trees by having predecessor(s) and successor(s) pointers.
