Priority queue with dynamic item priorities - algorithm

I need to implement a priority queue where the priority of an item in the queue can change and the queue adjusts itself so that items are always removed in the correct order. I have some ideas of how I could implement this but I'm sure this is quite a common data structure so I'm hoping I can use an implementation by someone smarter than me as a base.
Can anyone tell me the name of this type of priority queue so I know what to search for or, even better, point me to an implementation?

Priority queues such as this are typically implemented using a binary heap data structure, as someone else suggested, which is usually represented using an array but could also use a binary tree. It actually is not hard to increase or decrease the priority of an element in the heap. If you know you are changing the priorities of many elements before the next element is popped from the queue, you can temporarily turn off dynamic reordering, insert all of the elements at the end of the heap, and then reorder the entire heap (at a cost of O(n)) just before the next element needs to be popped. The important thing about heaps is that it costs only O(n) to put an array into heap order but O(n log n) to sort it.
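As a minimal sketch of that batching trick (my own C++ illustration using std::make_heap, not the Curl code mentioned below): append updated elements while reordering is "off", then rebuild the heap in O(n) before the next pop.

#include <algorithm>
#include <vector>

struct Task { int priority; /* payload omitted */ };
struct ByPriority {
    bool operator()(const Task& a, const Task& b) const {
        return a.priority < b.priority; // less-than comparator gives a max heap
    }
};

std::vector<Task> heap; // kept in heap order between batches

// O(1) per element; heap order is temporarily broken.
void push_deferred(Task t) { heap.push_back(t); }

// O(n): restore heap order over the whole array in one pass.
void rebuild() { std::make_heap(heap.begin(), heap.end(), ByPriority{}); }

// O(log n): standard pop of the maximum element.
Task pop_max() {
    std::pop_heap(heap.begin(), heap.end(), ByPriority{});
    Task t = heap.back();
    heap.pop_back();
    return t;
}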
I have used this approach successfully in a large project with dynamic priorities.
Here is my implementation of a parameterized priority queue implementation in the Curl programming language.

A standard binary heap supports 5 operations (the examples below assume a max heap):
* find-max: return the maximum node of the heap
* delete-max: remove the root node of the heap
* increase-key: update a key within the heap
* insert: add a new key to the heap
* merge: join two heaps to form a valid new heap containing all the elements of both
As you can see, in a max heap you can increase an arbitrary key, and in a min heap you can decrease an arbitrary key. Unfortunately you can't change keys both ways, but will this do? If you need to change keys in both directions then you might want to think about using a min-max heap.
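For illustration, here is a minimal sketch of increase-key on an array-backed max heap of ints (the function name and layout are my own, not from any particular library): raise the key, then sift the element up until its parent is no smaller.

#include <utility>
#include <vector>

void increase_key(std::vector<int>& heap, std::size_t i, int new_key) {
    heap[i] = new_key; // assumes new_key >= heap[i], per max-heap increase-key
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[parent] >= heap[i]) break; // heap property restored
        std::swap(heap[parent], heap[i]);
        i = parent;
    }
}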

I would suggest first trying the head-on approach. To update a priority:
delete the item from the queue
re-insert it with the new priority
In C++, this could be done using a std::multimap; the important thing is that the object must remember where it is stored in the structure so that it can delete itself efficiently. For the re-insert it is hard to do better, since you cannot presume to know anything about the other priorities.
#include <map>
#include <string>

class Item;
typedef std::multimap<int, Item*> priority_queue;

class Item
{
public:
    void add(priority_queue& queue);
    void remove();
    int getPriority() const;
    void setPriority(int priority);
    std::string& accessData();
    const std::string& getData() const;
private:
    int mPriority;
    std::string mData;
    priority_queue* mQueue = nullptr; // null until add() is called
    priority_queue::iterator mIterator;
};

void Item::add(priority_queue& queue)
{
    mQueue = &queue;
    // Remember where we live so remove() is O(log n).
    mIterator = queue.insert(std::make_pair(mPriority, this));
}

void Item::remove()
{
    mQueue->erase(mIterator); // mQueue is a pointer, so use ->
    mQueue = nullptr;
    mIterator = priority_queue::iterator();
}

void Item::setPriority(int priority)
{
    mPriority = priority;
    if (mQueue)
    {
        // Re-key by deleting and re-inserting under the new priority.
        priority_queue& queue = *mQueue;
        this->remove();
        this->add(queue);
    }
}
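A hypothetical usage sketch of the class above (accessor bodies are omitted in the original, so this is illustrative only):

priority_queue queue;
Item item;
item.setPriority(5);  // no queue attached yet; just records the priority
item.add(queue);      // inserted under key 5
item.setPriority(10); // removed and re-inserted under key 10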

I am looking for exactly the same thing!
And here is some of my idea: since the priority of an item keeps changing, it's pointless to sort the queue before retrieving an item. So we should forget about using a priority queue and instead "partially" sort the container while retrieving an item, choosing from the following STL algorithms:
a. partition
b. stable_partition
c. nth_element
d. partial_sort
e. partial_sort_copy
f. sort
g. stable_sort
partition, stable_partition and nth_element run in linear time, which should make them our first choices.
BUT it seems those algorithms are not provided in the official Java library. As a result, I suggest you use java.util.Collections.max/min to do what you want.
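In C++ the same "select only at retrieval" idea can be sketched with a linear scan via std::max_element (the analogue of Collections.max; the Job type and function name are illustrative):

#include <algorithm>
#include <vector>

struct Job { int priority; /* payload omitted */ };

// Assumes jobs is non-empty. O(n) per pop, O(1) for any priority change.
Job pop_highest(std::vector<Job>& jobs) {
    auto it = std::max_element(jobs.begin(), jobs.end(),
        [](const Job& a, const Job& b) { return a.priority < b.priority; });
    Job top = *it;
    *it = jobs.back(); // O(1) removal: overwrite with last element and shrink
    jobs.pop_back();
    return top;
}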

Google has a number of answers for you, including an implementation of one in Java.
However, this sounds like something that would be a homework problem, so if it is, I'd suggest trying to work through the ideas yourself first, then potentially referencing someone else's implementation if you get stuck somewhere and need a pointer in the right direction. That way, you're less likely to be "biased" towards the precise coding method used by the other programmer and more likely to understand why each piece of code is included and how it works. Sometimes it can be a little too tempting to do the paraphrasing equivalent of "copy and paste".

Related

Can you reverse a queue by only reversing its head and tail pointers?

While attempting to reverse a queue, I found a generally agreed-upon way:
Dequeue each value and push it onto a stack; then pop each value from the stack and enqueue it back into the queue.
By agreed upon, I mean most of my Google searches on reversing a queue end up taking me to that solution.
Even though that way is correct and relatively performant at linear time, I believe there's a better way that is simpler and runs in constant time.
Assuming that a queue is implemented using a doubly-linked list, can't you reverse it in O(1) time by just reversing the head and tail pointers?
If you want to treat a doubly linked list as a queue, then it's only a convention which end is the head and which is the tail, set by the direction in which you iterate it. But the point of a queue is its interface. So given any arbitrary queue, implemented in an unknown way (there are MANY things that implement queues, including queues that distribute themselves across many computers), the question is how you can reverse it, and that means you cannot rely on an underlying implementation.
A specific implementation might implement optimizations for certain operations.
No, you cannot just swap the head and tail pointers of a doubly linked list. You also need to swap the next and previous pointers in each node. This will still take O(n) time.
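To make the O(n) claim concrete, here is a minimal sketch (node layout assumed) of what a genuine in-place reversal of a doubly linked list requires:

#include <utility>

struct DLLNode {
    int value;
    DLLNode* next;
    DLLNode* prev;
};

void reverse(DLLNode*& head, DLLNode*& tail) {
    // Exchange next/prev in every node: O(n) even though the final
    // head/tail swap is O(1).
    for (DLLNode* node = head; node != nullptr; node = node->prev) {
        std::swap(node->next, node->prev); // after the swap, prev is the old next
    }
    std::swap(head, tail);
}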
Short Answer : NO
LONG ANSWER/REASON
The queue is an abstract data type. That means there is no single physical representation of such a data structure: a queue can be implemented in many ways. The most basic implementation is the one that uses an array.
struct Queue{
    int elements[50]; // the actual/physical (array) data structure which houses the data
    int max = 50;     // the maximum number of elements that can be stored in this queue
    int front, rear;  // indices of the front and rear
};
Now, I am sure you know how to define operations on this kind of a Queue.
void enQ(Queue &Q,int new_element);
int deQ(Queue &Q);
int getFront(Queue Q);
That means if I have to add an element 7 to the queue identified by Q, I need to execute enQ(Q,7). Let us say I have added 10 items like that by calling enQ 10 times. Then I add the number 89 to the queue. Now if I have to get this number 89 (assume that all numbers are unique), I will first have to deQ the first 10 items and then call deQ again to get 89. I am sure you will agree that this is the principle of a queue.
Now time for some magic. If I knew that 89 was the 11th number I added, I could get it directly with Q.elements[(Q.front+11)%Q.max]. Also, if I knew that 89 was the number I just added, I could get it with Q.elements[Q.rear].
Wow! Does that mean the principles of the queue got violated? No. It just means that I am not using a queue anymore. I am actually using an array, but trying to fool myself by doing it in a sophisticated manner (by putting it in a structure and all that).
If you are using a queue, you can only use the three methods I mentioned above. You are not allowed to meddle with the inner workings of the queue. You might be thinking that your case is different because you only want to change the front and rear values and not the actual data. But no: in a real queue, you are not even allowed to access the front and rear. All you have access to are the three methods I defined above.
That is why the actual implementation of a queue should be
class Queue{
    int elements[50]; // the actual/physical (array) data structure which houses the data
    int max = 50;     // the maximum number of elements that can be stored in this queue
    int front, rear;  // indices of the front and rear
public:
    void enQ(int new_element);
    int deQ();
    int getFront();
};
Now we are upholding the real essence of a queue. A similar layout should be used if you are implementing a queue using a doubly linked list: the front and rear pointers should be private and inaccessible to the user.
Therefore it is not possible to reverse a QUEUE faster than O(n).
So the bottom line is: if you want to change the queue pointers of a doubly linked list, by all means do it. But you cannot call it reversing a queue, because then you would not be using a queue. In fact, that would be a different data structure: a deque (double-ended queue). If you really want to implement reversing in O(1) time complexity, I suggest you go ahead with your method, but you will have to stop calling it a queue, because it is then a deque (BTW, there is nothing wrong with using a deque; by all means, use it). Or, if you don't like the sound of "deque", you can define your own data structure called a reversible queue.
You can define your data structure like:
class ReversibleQueue{
    DLLNode *front, *rear; // pointers into a doubly linked list
public:
    void enQ(int new_element);
    int deQ();
    int getFront();
    void reverse();
};
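One sketch of how reverse() could then run in O(1) (this is my assumption of an implementation, not part of the answer above): swap the end pointers and keep a direction flag, so the per-node links are never rewritten.

#include <utility>

struct DLLNode {
    int value;
    DLLNode* next;
    DLLNode* prev;
};

class ReversibleQueue{
    DLLNode* front = nullptr;
    DLLNode* rear = nullptr;
    bool reversed = false; // which link direction currently leads to the rear
public:
    void reverse() {
        std::swap(front, rear); // O(1): no per-node pointer updates
        reversed = !reversed;
    }
    // enQ/deQ/getFront would follow node->next when !reversed and
    // node->prev when reversed, so the node links stay untouched.
};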

Why are we not saving the parent pointer in a B+ tree for easy upward traversal?

Would it affect much if I added a pointer to the parent node to simplify the splitting and insertion process?
A general node would then look something like this:
class BPTreeNode{
    bool leaf;
    BPTreeNode *next;
    BPTreeNode *parent; // add-on
    std::vector<int*> pointers;
    std::vector<int> keys;
};
What challenges might I run into in a real-life database system?
Right now I am only implementing it as a hobby project.
There are two reasons I can think of:
The algorithm for deleting a value from a B+ tree may result in an internal block A that has too few child blocks. If neither the block to the left nor the block to the right of A can pass an entry to A to resolve this violation, then block A needs to merge into a sibling block B. This means that all the child blocks of block A need to have their parent pointer updated to block B. This is additional work that greatly increases the number of blocks that need an update in a delete operation.
It represents extra space that is not really needed for performing the standard B+ tree operations. When searching for a value via a B+ tree you can easily keep track of the path down to the leaf level and use it for backtracking upwards in the tree.
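A minimal sketch of that path-tracking idea (the node layout mirrors the question minus the parent field; child_for_key is a hypothetical helper):

#include <vector>

struct BPTreeNode {
    bool leaf;
    BPTreeNode *next;
    std::vector<int*> pointers;
    std::vector<int> keys;
};

// Hypothetical helper: returns the child block whose key range covers `key`.
BPTreeNode* child_for_key(BPTreeNode* node, int key);

// Record every block visited on the way down. path.back() is the leaf and
// the earlier entries are its ancestors, so a split or merge can walk back
// up the tree without any parent pointers.
std::vector<BPTreeNode*> find_leaf_path(BPTreeNode* root, int key) {
    std::vector<BPTreeNode*> path;
    BPTreeNode* node = root;
    while (node != nullptr) {
        path.push_back(node);
        if (node->leaf) break;
        node = child_for_key(node, key);
    }
    return path;
}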

Why Use A Doubly Linked List and HashMap for an LRU Cache Instead of a Deque?

I have implemented the Design LRU Cache problem on LeetCode using the conventional method (doubly linked list + hash map). For those unfamiliar with the problem, the implementation looks something like the sketch below:
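(A minimal C++ sketch of that conventional layout, assuming integer keys and values; the actual LeetCode submission may differ in detail.)

#include <list>
#include <unordered_map>
#include <utility>

class LRUCache {
    std::size_t capacity;
    std::list<std::pair<int, int>> items; // front = most recently used
    std::unordered_map<int, std::list<std::pair<int, int>>::iterator> index;
public:
    explicit LRUCache(std::size_t cap) : capacity(cap) {}

    int get(int key) {
        auto it = index.find(key);
        if (it == index.end()) return -1;
        // Move the touched node to the front in O(1); iterators stay valid.
        items.splice(items.begin(), items, it->second);
        return it->second->second;
    }

    void put(int key, int value) {
        auto it = index.find(key);
        if (it != index.end()) {
            it->second->second = value;
            items.splice(items.begin(), items, it->second);
            return;
        }
        if (items.size() == capacity) {
            // Evict the least recently used entry from the back.
            index.erase(items.back().first);
            items.pop_back();
        }
        items.emplace_front(key, value);
        index[key] = items.begin();
    }
};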
I understand why this method is used (quick removal/insertion at both ends, fast access in the middle). What I am failing to understand is why someone would use both a HashMap and a LinkedList when one could simply use an array-based deque (ArrayDeque in Java, simply deque in C++). This deque allows for easy insertion/deletion at both ends and quick access in the middle, which is exactly what you need for an LRU cache. You would also use less space because you wouldn't need to store a pointer to each node.
Is there a reason why the LRU cache is almost universally designed (in most tutorials at least) with the HashMap/LinkedList method as opposed to the deque/ArrayDeque method? Would the HashMap/LinkedList method have any benefits?
When an LRU cache is full, we discard the Least Recently Used item.
If we're discarding items from the front of the queue, then we have to make sure the item at the front is the one that hasn't been used for the longest time.
We ensure this by making sure that an item goes to the back of the queue whenever it is used. The item at the front is then the one that hasn't been moved to the back for the longest time.
To do this, we need to maintain the queue on every put OR get operation:
When we put a new item in the cache, it becomes the most recently used item, so we put it at the back of the queue.
When we get an item that is already in the cache, it becomes the most recently used item, so we move it from its current position to the back of the queue.
Moving items from the middle to the end is not a deque operation and is not supported by the ArrayDeque interface. It's also not supported efficiently by the underlying data structure that ArrayDeque uses. Doubly-linked lists are used because they do support this operation efficiently.
The purpose of an LRU cache is to support two operations in O(1) time: get(key) and put(key, value), with the additional constraint that least recently used keys are discarded first. Normally the keys are the parameters of a function call and the value is the cached output of that call.
Regardless of how you approach this problem we can agree that you MUST use a hashmap. You need a hashmap to map a key already present in the cache to the value in O(1).
In order to deal with the additional constraint of least recently used keys being discarded first, you can use a LinkedList or an ArrayDeque. However, since we don't actually need to access the middle, a LinkedList is better because it never needs to resize.
Edit:
Mr. Timmermans discussed in his answer why ArrayDeques cannot be used in an LRU cache, due to the necessity of moving elements from the middle to the end. That being said, here is an implementation of an LRU cache that successfully submits on LeetCode using only appends and poplefts on the deque. Note that Python's collections.deque is implemented as a doubly linked list; however, we only use operations on collections.deque that are also O(1) in a circular array, so the algorithm stays the same regardless.
from collections import deque

class LRUCache:
    def __init__(self, capacity: 'int'):
        self.capacity = capacity
        self.hashmap = {}    # key -> [value, pending occurrences in deque]
        self.deque = deque()

    def get(self, key: 'int') -> 'int':
        res = self.hashmap.get(key, [-1, 0])[0]
        if res != -1:
            self.put(key, res)  # a hit counts as a fresh use
        return res

    def put(self, key: 'int', value: 'int') -> 'None':
        self.add(key, value)
        while len(self.hashmap) > self.capacity:
            self.remove()

    def add(self, key, value):
        if key in self.hashmap:
            # Key already cached: bump its occurrence count and update the value.
            self.hashmap[key][1] += 1
            self.hashmap[key][0] = value
        else:
            self.hashmap[key] = [value, 1]
        self.deque.append(key)

    def remove(self):
        # Pop stale occurrences from the left; a key is only evicted once
        # its last queued occurrence has been popped.
        k = self.deque.popleft()
        self.hashmap[k][1] -= 1
        if self.hashmap[k][1] == 0:
            del self.hashmap[k]
I do agree with Mr. Timmermans that the LinkedList approach is preferable, but I want to highlight that using an ArrayDeque to build an LRU cache is possible.
The main mix-up between me and Mr. Timmermans is how we interpreted capacity. I took capacity to mean caching the last N get/put requests, while Mr. Timmermans took it to mean caching the last N unique items.
The above code does have a loop in put which slows the code down, but this is just to make the code conform to caching the last N unique items. If we had the code cache the last N requests instead, we could replace the loop with:
if len(self.deque) > self.capacity: self.remove()
This would make it as fast as, if not faster than, the linked-list variant.
Regardless of how capacity is interpreted, the above method still works as an LRU cache: least recently used elements get discarded first.
I just want to highlight that designing an LRU cache in this manner is possible. The source is right there - try to submit it on LeetCode!
A doubly linked list is a natural implementation of a queue. Because doubly linked lists have immediate access to both the front and the end of the list, they can insert data on either side in O(1) as well as delete data on either side in O(1). Because doubly linked lists can insert data at the end in O(1) time and delete data from the front in O(1) time, they make a perfect underlying data structure for a queue. Queues are lists of items in which data can only be inserted at the end and removed from the beginning.
Queues are an example of an abstract data type, and we are able to use an array to implement them under the hood. Now, since queues insert at the end and delete from the beginning, arrays are only so good as the underlying data structure. While arrays are O(1) for insertions at the end, they're O(N) for deletions from the beginning. A doubly linked list, on the other hand, is O(1) both for inserting at the end and for deleting from the beginning. That's what makes it a perfect fit for serving as the queue's underlying data structure.
Python's deque uses a doubly linked list as part of its data structure. With doubly linked lists, deque is capable of inserting or deleting elements from both ends of a queue with constant O(1) performance.

What are the advantages of using a static linked list?

In this question, a static linked list is defined as the following (C++ code):
template<typename T> struct Node{
    T elem;
    int next; // yes, int: the index of the next element in the array
};
Node<int> static_linked_list[SOME_SIZE]; // Node needs a type argument, e.g. int
//some initialization code omitted.
So in this kind of linked list, it is static because its storage is allocated when the array is initialized. The linking is achieved through the int next field, which holds the index of the next element in the array.
What are the advantages of this data structure over a pointer- (or reference-) based linked list? What are its applications? As far as I know, the static one has a scoped lifetime and may be used when implementing malloc. But its int next field doesn't seem to use any less memory than a pointer.
I'm not sure what this particular structure is being used for, but the unusual technique offers the ability to behave like a linked list while using a capped, fixed, pre-allocated memory block that doesn't need to be managed beyond updating the indexes in the elements as required. (Note that the index field doesn't have to "point" to the next numeric index; it can point to any index, so the list does not need to be stored in semantic order.)
It's faster to "remove" an item than if it were a simple array (which would require shifting later elements). Adding items is trickier and is obviously limited by the size of the overall element array, but it could be sped up with some look-aside bookkeeping, such as a free list of empty slots (see the sketch below). I'm not sure under what exact circumstances you'd decide you needed this particular data structure over a different kind of list. My guess is you'd be under pretty cautious memory constraints where predictability was king: think gaming consoles, embedded devices, drivers/operating-system layers, etc.
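A minimal sketch of that free-list bookkeeping (my own illustration; -1 plays the role of a null index): free slots are chained through the same next field that live nodes use, so both allocation and removal are O(1) with no shifting.

template<typename T> struct Node {
    T elem;
    int next; // index of the next node, or -1 for end of chain
};

template<typename T, int N>
struct StaticList {
    Node<T> nodes[N];
    int head = -1;     // first live element
    int free_head = 0; // first free slot

    StaticList() {
        // Initially every slot is free: chain 0 -> 1 -> ... -> N-1 -> -1.
        for (int i = 0; i < N; ++i)
            nodes[i].next = (i + 1 < N) ? i + 1 : -1;
    }

    int push_front(const T& value) {
        if (free_head == -1) return -1; // out of capacity
        int slot = free_head;           // O(1) allocation from the free list
        free_head = nodes[slot].next;
        nodes[slot].elem = value;
        nodes[slot].next = head;
        head = slot;
        return slot;
    }

    void remove_after(int i) {
        // Unlink nodes[i].next and recycle it onto the free list.
        int victim = nodes[i].next;
        if (victim == -1) return;
        nodes[i].next = nodes[victim].next;
        nodes[victim].next = free_head;
        free_head = victim;
    }
};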

Sorting two-way cell connections with priority

I have a two-dimensional grid of cells. In this simulation, a cell may request to switch position with another cell. The request also has a priority.
The problem is that I'm having a hard time coming up with a good way to structure this. If A wants to switch with B, and B also wants to switch with A, they can currently be switched and then switched back in a single logic tick (which should be impossible).
The solution probably involves making sure (A to B) == (B to A) and insertion-sorting the requests into a list by their priority.
Does such a data structure have a name? Does anyone recognise the problem and can point me to some good links for reading?
I can't say that I've come across an example like this before, so I don't know what it would be called, but perhaps something like this would work...
Cell - a class or struct
CellId
XCoordinate
YCoordinate
SwitchRequest - a class or struct
RequestingCell
TargetCell
Priority
CanSwitch
SwitchRequests - an array of SwitchRequests
AlreadySwitchedCells - an array of Cells
Algorithm
For each tick:
clear AlreadySwitchedCells
build list of SwitchRequests
sort SwitchRequests by Priority (highest to lowest)
loop through each SwitchRequest
{
if (RequestingCell is not in AlreadySwitchedCells and TargetCell is not in AlreadySwitchedCells)
{
add RequestingCell and TargetCell to AlreadySwitchedCells
SwapCellIds(RequestingCell, TargetCell)
}
}
Note: There are some options here, like whether you should make the coordinates properties of a Cell or just store the CellIds in a two-dimensional array, but hopefully this gives you a starting point.
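A minimal sketch of that tick loop (names mirror the outline above; the Cell and request types are my assumptions):

#include <algorithm>
#include <set>
#include <utility>
#include <vector>

struct Cell { int id, x, y; };

struct SwitchRequest {
    Cell* requestingCell;
    Cell* targetCell;
    int priority;
};

void process_tick(std::vector<SwitchRequest>& requests) {
    std::set<int> alreadySwitched; // ids of cells that have switched this tick

    // Sort by priority, highest to lowest.
    std::sort(requests.begin(), requests.end(),
        [](const SwitchRequest& a, const SwitchRequest& b) {
            return a.priority > b.priority;
        });

    for (const SwitchRequest& r : requests) {
        // Each cell may take part in at most one switch per tick, so a
        // pair can never be switched and then switched back.
        if (alreadySwitched.count(r.requestingCell->id) ||
            alreadySwitched.count(r.targetCell->id))
            continue;
        alreadySwitched.insert(r.requestingCell->id);
        alreadySwitched.insert(r.targetCell->id);
        std::swap(r.requestingCell->id, r.targetCell->id); // SwapCellIds
    }
}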
