Is deleting random node from a heap possible? - data-structures

I have a situation where I want to delete a random node from the heap. What choices do I have? I know we can easily delete the last node and the first node of the heap. However, I am not sure whether the behavior is well defined for deleting a random node from the heap.
e.g.
_______________________
|X|12|13|14|18|20|21|22|
------------------------
So in this case I can delete the node 12 and 22, this is defined, but can I for example delete a random node, e.g. say 13, and still somehow maintain the complete tree property of the heap (along with other properties)?

I'm assuming that you're describing a binary heap maintained in an array, with the invariant that A[N] <= A[N*2] and A[N] <= A[N*2 + 1] (a min-heap).
If yes, then the approach to deletion is straightforward: replace the deleted element with the last element, and decrement the variable that holds the total number of entries in the heap. Then restore the invariant at that position: perform a sift-down if the moved element is larger than one of its children, or a sift-up if it is smaller than its new parent. At most one of the two will actually move it.
Incidentally, if you're working through heap examples, I find it better to use examples whose array is not fully sorted. There's nothing in the definition of a heap that requires (e.g.) A[3] <= A[5], and it's easy to get misled if your examples have such an ordering.

I don't think it is possible to remove a random element from a heap. Let's take this example (following the same convention):
3, 10, 4, 15, 20, 6, 5.
Now if I delete element 15, the heap becomes: 3, 10, 4, 5, 20, 6
This makes heap inconsistent because of 5 being child of 10.
The reason I think random deletion won't work is because you may substitute an inside node (instead of root or a leaf) in the heap, and thus there are two paths (parents and children) to heapify (as compared to 1 path in case of pop() or insert()).
Please let me know in case I am missing something here.
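For what it's worth, the inconsistency in the example above can be repaired by letting the moved element travel up instead of down. Below is a minimal sketch, assuming a min-heap stored in a 0-based Python list (so the children of index i are at 2*i + 1 and 2*i + 2, unlike the 1-based convention used in this thread). The function name delete_at is hypothetical and not taken from any library.

```python
def delete_at(heap, i):
    """Remove and return heap[i] from a 0-based min-heap stored in a list."""
    removed = heap[i]
    last = heap.pop()              # shrink the heap by one entry
    if i == len(heap):             # the deleted element was the last one
        return removed
    heap[i] = last                 # fill the hole with the former last element

    # Sift up: needed when the moved element is smaller than its new parent.
    while i > 0 and heap[i] < heap[(i - 1) // 2]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

    # Sift down: needed when the moved element is larger than a child.
    # If the loop above moved the element, this loop does nothing.
    n = len(heap)
    while True:
        smallest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and heap[child] < heap[smallest]:
                smallest = child
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest
    return removed


heap = [3, 10, 4, 15, 20, 6, 5]
delete_at(heap, 3)                 # delete 15; 5 fills the hole and sifts up past 10
print(heap)                        # [3, 5, 4, 10, 20, 6]
```

Only one path is ever followed: an element that is smaller than its parent cannot also be larger than a child, because the parent was already no larger than those children.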

Related

Did I correctly perform extract max operation on this max-heap?

I am trying to understand how heaps work.
I have the following heap:
Now I want to extract the max value.
The first thing I do is delete the root 42, then put the last element in the heap (6) at the root position. I then perform max-heapify to find the correct spot for 6.
6 is smaller than its two children, so I swap it with the largest child 41, making 41 the new root.
6 now has the children 3 and 9. I therefore again swap it with the larger child 9.
In the end I end up with the heap
Did I correctly perform extract-max?
Yes!
Max-heapify works recursively. Find the largest element among the three, i.e. the parent and its two children. If the largest is not the parent, swap the largest element with the parent and call max-heapify again on the position the parent moved to.
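The procedure described above can be sketched as follows, assuming a max-heap stored in a 0-based Python list. The names max_heapify and extract_max are illustrative. The figure from the question is not available here, so the heap below is a hypothetical one that merely reuses the values mentioned in the text (root 42, largest child 41, last element 6, and 3 and 9 one level further down).

```python
def max_heapify(a, i, n):
    """Restore the max-heap property below index i, treating a[:n] as the heap."""
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and a[left] > a[largest]:
        largest = left
    if right < n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)   # continue where the old parent landed


def extract_max(a):
    """Remove and return the root of a non-empty max-heap."""
    top = a[0]
    last = a.pop()                   # take the last element off the heap
    if a:
        a[0] = last                  # put it at the root ...
        max_heapify(a, 0, len(a))    # ... and let it sink to its place
    return top


heap = [42, 41, 27, 3, 9, 12, 6]     # hypothetical heap, not the one in the figure
print(extract_max(heap))             # 42
print(heap)                          # [41, 9, 27, 3, 6, 12]
```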

Why isn't heapsort stable?

I'm trying to understand why heapsort isn't stable.
I've googled this, but haven't found a good, intuitive explanation.
I understand the importance of stable sorting - it allows us to sort based on more than one key, which can be very beneficial (i.e., do multiple sortings, each based on a different key. Since every sort will preserve the relative order of elements, previous sortings can add up to give a final list of elements sorted by multiple criteria).
However, why wouldn't heapsort preserve this as well?
Thanks for your help!
Heap sort unstable example
Consider array 21 20a 20b 12 11 8 7 (already in max-heap format)
Here 20a = 20b; the suffixes a and b only mark the order in which they appear in the input.
During heapsort, 21 is removed first and placed at the last index. Then 20a is removed and placed at the second-to-last index, and 20b at the third-to-last index, so after heapsort the array looks like
7 8 11 12 20b 20a 21.
It does not preserve the order of elements and hence can't be stable
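The example above can be reproduced with a short sketch. It assumes a textbook in-place heapsort on a 0-based list that compares only the numeric key and prefers the left child on ties; the tuples (20, "a") and (20, "b") stand in for 20a and 20b. The function names are illustrative.

```python
def sift_down(a, i, n, key):
    """Sink a[i] within the max-heap a[:n], comparing elements by key only."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and key(a[left]) > key(a[largest]):
            largest = left
        if right < n and key(a[right]) > key(a[largest]):
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(items, key):
    """Return a new list sorted ascending by key using heapsort."""
    a = list(items)
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build the max-heap
        sift_down(a, i, n, key)
    for end in range(n - 1, 0, -1):       # move the current maximum to the end
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end, key)
    return a


data = [(21, ""), (20, "a"), (20, "b"), (12, ""), (11, ""), (8, ""), (7, "")]
result = heapsort(data, key=lambda item: item[0])
print([f"{k}{tag}" for k, tag in result])
# ['7', '8', '11', '12', '20b', '20a', '21']
```

The keys come out in ascending order, but 20b now precedes 20a, which is the reverse of their input order.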
The final sequence of the results from heapsort comes from removing items from the created heap in purely size order (based on the key field).
Any information about the ordering of the items in the original sequence was lost during the heap creation stage, which came first.
Stable means if the two elements have the same key, they remain in the same order or positions. But that is not the case for Heap sort.
Heapsort is not stable because operations on the heap can change the relative order of equal items.
From here:
When sorting (in ascending order), heapsort first picks the largest
element and puts it at the end of the list. So the element that has
been picked first stays last, and the element that has been picked
second becomes the second-to-last element in the sorted list.
Again, the Build-Max-Heap procedure works such that it preserves the order
of equal values (e.g. 3a, 3b) when building the heap tree. For extracting
the maximum element it also works from the root and tries to preserve
the structure of the tree (except for the change made by Heapify).
So what happens is that, for elements with the same value [3a, 3b], heapsort picks
3a before 3b but puts 3a to the right of 3b. As the list is
sorted in ascending order, we get 3b before 3a in the list.
If you try heapsort with (3a,3b,3b) then you can visualize the
situation.
Stable sort algorithms sort elements such that order of repeating elements in the input is maintained in the output as well.
Heap-Sort involves two steps:
1. Heap creation
2. Removing the root element from the heap tree and adding it to a new array, which ends up in sorted order
1. Order breaks during Heap Creation
Let's say the input array is {1, 5, 2, 3, 2, 6, 2} and for the purpose of seeing the order of 2's, say they are 2a, 2b and 2c so the array would be {1, 5, 2a, 3, 2b, 6, 2c}
Now if you create a heap (a min-heap here) out of it, its array representation will be {1, 2b, 2a, 3, 5, 6, 2c}, where the order of 2a and 2b has already changed.
2. Order breaks during removal of root element
Now when we have to remove the root element (1 in our case) from the heap to put it into another new array, we swap it with the element in the last position and remove it from there, hence changing the heap into {2c, 2b, 2a, 3, 5, 6}. We repeat the same and this time we will remove '2c' from the heap and put it in the new array right after the '1'.
When we have repeated this step until the heap is empty and every element has been transferred to the new array, the new (sorted) array will look like {1, 2c, 2b, 2a, 3, 5, 6}.
Input to Heap-Sort: {1, 5, 2a, 3, 2b, 6, 2c} --> Output: {1, 2c, 2b, 2a, 3, 5, 6}
Hence we see that the repeating elements (the 2's) are not in the same order in the heap-sorted array as they appear in the input, and therefore Heap-Sort is not stable!
I know this is a late answer but I will add my 2 cents here.
Consider a simple array of 3 integers: 2,2,2. If you build a max heap using the build max heap function, you will find that the array storing the input has not changed, as it is already in max-heap form. Now when we put the root of the tree at the end of the array in the first iteration of heapsort, the stability of the array is already gone. So there you have a simple example of the instability of heapsort.
Suppose we take an array of arbitrary size n that contains two consecutive equal elements (say 15 and 15) in the heap, whose parent indices hold values like 4 and 20, so that the actual order is (..., 4, 20, ..., 15, 15, ...). The relative order of 4 and the 1st 15 remains the same, but as 20 > 15, the 2nd 15 comes to the front (by a swap) as defined in the heapsort algorithm, and the relative order of the two 15s is gone.

Heapsort using multiple heaps

I found a variant of Heapsort using multiple heaps at http://students.ceid.upatras.gr/~lebenteas/Heapsort-using-Multiple-Heaps-final.pdf. The solution proposes that instead of the traditional Heapsort algorithm, where after each swap we do another siftdown to bring the highest value in the current heap to the root, we can do some other things. However, I cannot understand what exactly they mean by 'other things'.
For example, at one point they say: "We “forget”, for the time being, the existence of the root." That surely means we are currently stalling the swapping of the highest element with the last element of the heap. However, just a few lines later, they say: "So far, two elements have been transferred in the sorted part of the heap.", which runs counter to the proposition that the swapping hasn't been done yet. Also, in the figure on page 97, the node with value 1 is missing, and I don't know why.
Can anybody give me an idea of what exactly the authors are trying to convey, and how worthwhile it can be?
(The line you asked about is in section 2.3, so I will explain the variation of heapsort which is proposed in section 2.3:)
When the author says we "forget" the existence of the root, this does not mean that they are stalling the swapping of the highest element. The swap is done, but they temporarily delay rebuilding the heap. After swapping the highest element into the root position, they compare the roots of the 2 subheaps, and swap one or the other with the next-highest element. Then, after doing 2 swaps (rather than 1), they rebuild the heap.
Then they take this idea a step further in sections 3 and 4, and propose another variant of heapsort, which uses more than one heap.
How do you keep more than one heap in an array? (To make it concrete, let's talk about 2 heaps.) Well, how do you keep a single heap? The root goes at index 0, its children are at 1 and 2, then the children of the left subheap are at 3 and 4, etc., right?
To put 2 heaps together in an array, keep the 2 roots at 0 and 1. The children of the first root go at 2 and 3, then the children of the 2nd root at 4 and 5... with such an arrangement, it is still possible to navigate up and down the tree by doing simple arithmetic operations on indexes.
The standard heapsort repeats 2 steps: swap the root with the last element in the "heap" area, then siftDown to rebuild the heap. This heapsort repeats the following 3 steps: compare the 2 roots to see which one is bigger, swap that one with the last element in the "heap" area, then call siftDown on the appropriate heap.
This requires an extra compare at each step, but the siftDown operations work on slightly shallower heaps, which saves more than a single compare.
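To make the index arithmetic concrete, here is a rough sketch of the two-heap variant as described in this answer; it is not the paper's code. With the two roots at indexes 0 and 1, the children of index i land at 2*i + 2 and 2*i + 3 (that formula is derived from the layout described above, and the function names are hypothetical).

```python
def sift_down(a, i, n):
    """Sink a[i] inside a[:n]; children of i sit at 2*i + 2 and 2*i + 3."""
    while True:
        largest = i
        left, right = 2 * i + 2, 2 * i + 3
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def two_heap_sort(a):
    """Sort the list a in place, ascending, using two interleaved max-heaps."""
    n = len(a)
    # Build both heaps at once, bottom-up; leaf positions are no-ops.
    for i in range(n - 1, -1, -1):
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        # The maximum of the heap area a[:end + 1] is one of the two roots.
        top = 0 if a[0] >= a[1] else 1
        a[top], a[end] = a[end], a[top]   # move it to its final position
        sift_down(a, top, end)            # repair only the heap that changed
    return a


print(two_heap_sort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
# [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

Each iteration spends one extra comparison on the two roots and then sifts down through a tree that is roughly one level shallower than a single heap over the same area.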

How worthwhile is this optimization for Heapsort?

The traditional Heapsort algorithm swaps the last element of the heap with the root of the current heap after every heapification, then continues the process again. However, I noticed that it is kind of unnecessary.
After a heapification of a sub-array, the root contains the highest value (if it's a max-heap), and the next 2 elements in the array must follow the root in the sorted array, either in the same order as they are now, or exchanged if they are reverse-sorted. So instead of just swapping the root with the last element, won't it be better to swap the first 3 elements (the root plus the 2nd and 3rd elements, exchanged first if necessary) with the last 3 elements, so that the 2 subsequent heapifications (for the 2nd and 3rd elements) are dispensed with?
Is there any disadvantage with this method (apart from the if-needed swapping of the 2nd and 3rd elements, which should be trivial)? If not, if it is indeed better, how much performance boost will it give? Here is the pseudo-code:
function heapify(a,i) {
#assumes node i has two children, both of which are heaps. However node i itself might not be a heap, i.e. one of its children may be greater than its value.
#if so, find the greater of its two children, then swap the parent with that value.
#this might render that child as no longer a heap, so recurse
}
function create_heap(a) {
#all elements following index |_n/2_| are leaf nodes, thus heapify() should be applied to all elements within index 1 to |_n/2_|
}
function heapsort(a) {
create_heap(a); #a is now a max-heap
#root of the heap, that is a[1] is the maximum, so swap it with a[n].
#The root now contains an element smaller than its children, both of which are themselves heaps.
#so apply heapify(a,1). Note: heap length is now n-1, as a[n] is the largest element already found
#now again the array is a heap. The highest value is in a[1]. Swap it with a[n-1].
#continue
}
Suppose the array is [4,1,3,2,16,9,10,14,8,7]. After building the heap, it becomes [16,14,10,8,7,9,3,2,4,1]; the trailing 1 is left out of the arrays below, so the heap is [16,14,10,8,7,9,3,2,4]. Now the heapsort's first iteration will swap 16 and 4, leading to [4,14,10,8,7,9,3,2,16]. Since this has now rendered the root of the new heap [4,14,10,8,7,9,3,2] as, umm, un-heaped (14 and 10 both being greater than 4), run another heapify to produce [14,8,10,4,7,9,3,2]. Now 14 being the root, swap it with 2 to yield [2,8,10,4,7,9,3,14], thus making the array currently [2,8,10,4,7,9,3,14,16]. Again we find that 2 is un-heaped, so again doing a heapify makes the heap [10,8,9,4,7,2,3]. Then 10 is swapped with 3, making the array [3,8,9,4,7,2,10,14,16].
My point is that instead of doing the 2nd and 3rd heapifications to store 10 and 14 before 16, we can tell from the first heapification that because 14 and 10 follow 16, they are the 2nd and 3rd largest elements (or vice versa). So after a comparison between them (in case they are already sorted, 14 comes before 10), I swap all three (16,14,10) with (3,2,4), making the array [3,2,4,8,7,9,16,14,10]. This brings us to a condition similar to the one after the further two heapifications: [3,8,9,4,7,2,10,14,16] originally, as compared to [3,2,4,8,7,9,16,14,10] now. Both will now need further heapification, but the 2nd method has let us arrive at this juncture directly by just a comparison between two elements (14 and 10).
The second largest element of the heap is present in the second or third position, but the third largest can be present further down, at depth 2. (See the figure in http://en.wikipedia.org/wiki/Heap_(data_structure) ). Furthermore, after swapping the first three elements with the last three, the heapify method would first heapify the first subtree of the root, followed by the second subtree of the root, followed by the whole tree. Thus the total cost of this operation is close to three times the cost of swapping the top element with the last and calling heapify. So you won't gain anything by doing this.
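A small counterexample makes the first point concrete. The heap below is a made-up one (not taken from the question), stored in a 0-based Python list; is_max_heap is a hypothetical helper that only checks the heap property.

```python
def is_max_heap(a):
    """True if every element is no larger than its parent (0-based layout)."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


heap = [10, 9, 5, 8, 7, 4, 3]           # a valid max-heap
third_largest = sorted(heap, reverse=True)[2]
position = heap.index(third_largest)
depth = (position + 1).bit_length() - 1  # depth of a node in the implicit tree

print(is_max_heap(heap))                 # True
print(third_largest, position, depth)    # 8 3 2
```

Here the second and third array slots hold 9 and 5, while the third-largest value 8 sits at depth 2, so swapping the first three elements to the end would misplace 5.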

priority queue data structure

Suppose that I have a priority queue which removes elements in increasing order, and stored in this queue are the elements 1, 1, 3, 0, 1. The increasing order is 0 then 1 then 3, but there are three element 1s.
When I call remove it will first remove the 0, but if I call remove again will it remove all three 1s at the same time, or will I need to call remove three separate times to remove all of the 1 elements?
Does a call to remove on such a priority queue remove all elements of the same minimum value or will only one element be removed with each call?
In a priority queue the remove operation usually removes a single record containing the maximum value (in your queue, which removes in increasing order, that is the minimum). So in your case it would be the second option. The order of removal among equal keys is not guaranteed: any one key with the "maximum" value would be removed. Also, an unsorted array is a bad data structure to implement a priority queue. You would typically use a heap data structure to get O(log(n)) guarantees on insertion and removal.
A typical heap implementation re-heapifies the tree after every removal, so it would remove 0, 1, 1, 1 and then 3, since a 1 gets pushed to the root during each re-heapification.
Am I wrong?
Edit: your case is a min-heap.
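As a quick illustration, here is a sketch using Python's standard heapq module as the min-heap (the question does not say which priority queue implementation is in use, so this is only one possible stand-in). Each heappop call removes exactly one element.

```python
import heapq

queue = []
for value in [1, 1, 3, 0, 1]:
    heapq.heappush(queue, value)     # O(log n) insertion

removed = []
while queue:
    removed.append(heapq.heappop(queue))   # removes a single minimum per call

print(removed)        # [0, 1, 1, 1, 3]
print(len(removed))   # 5 separate removals, one per element
```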
