Insert/remove the same element in heap sort - data-structures

Show the heap at each stage when the following numbers are inserted into an initially empty min-heap in the given order: {11, 17, 13, 4, 4, 1}. Now, show the heap at each stage when we successively perform the deleteMin operation on the heap until it is empty.
Here is the answer/checkpoint I receive:
![1](https://imgur.com/zu47RIF)
I have 2 questions please:
I don't understand when we insert element 4 the second time, why do we shift 11 to make it the right child of the old element/firstly inserted element 4? Is it because we want to satisfy a requirement of the complete binary tree, which is each node in the levels from 1 to k - 2 has exactly 2 children (k = levels of the trees, level k is the bottom-most level)?
I don't understand how, when we deleteMin (removing 1), 13 becomes the right child of the new parent 11 (which is the left child of 4). Just a quick note that my instructor gave the class 2 ways to deleteMin. The other way is fine with me - it's just the reverse of the insertion process.

Like you said, the heap shape is an "almost complete tree": all levels are complete, except the lowest level which can be incomplete to the right. Therefore, the second 4 is necessarily added to the right of 17 to preserve the heap shape:
        4
       / \
     11   13
    /  \
  17    4
After that, 4 switches places with 11 to regain the min-heap property.
Deletions are typically implemented by removing the root and putting the last (i.e., bottom-rightmost) element in its place. This preserves the heap shape. The new root is then allowed to sift down in order to regain the min-heap property. So 13 becomes the new root:
       13
      /  \
     4    4
    / \
  17   11
Then 13 switches places with either child node. It looks like they chose the right-hand child in your example.
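The insert and deleteMin steps described above can be sketched on an array-based min-heap. This is a minimal illustration, not the instructor's method: the function names `insert` and `delete_min` are made up here, and ties between equal children are broken toward the left child, which is an arbitrary choice (either child is valid).

```python
def insert(heap, key):
    """Append key at the next free slot (keeps the heap shape), then sift up."""
    heap.append(key)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break  # min-heap property holds again
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def delete_min(heap):
    """Remove the root, move the last element to the root, then sift down."""
    smallest = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            child = i
            if left < n and heap[left] < heap[child]:
                child = left
            # Strict "<" means an equal right child does not win (assumption).
            if right < n and heap[right] < heap[child]:
                child = right
            if child == i:
                break
            heap[i], heap[child] = heap[child], heap[i]
            i = child
    return smallest


heap = []
for key in [11, 17, 13, 4, 4, 1]:
    insert(heap, key)
    print("insert", key, "->", heap)
while heap:
    print("deleteMin ->", delete_min(heap), heap)
```

In level order, the array after the second 4 is inserted is [4, 4, 13, 17, 11], matching the swap of 4 with 11. With the left-child tie-break used here, the first deleteMin leaves [4, 11, 4, 17, 13]; breaking the tie to the right would leave [4, 4, 13, 17, 11] instead.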

Related

List all the keys that could have been the last key inserted in a Max heap

                 30
              /      \
          25            20
         /  \          /  \
      22      18     17    16
     /  \    /  \   /  \
   21   13  15   5 2    1
Above is a Max-heap created following a sequence of inserting and removing operations.
If we assume that the last operation was an insertion, which keys could have been the last key inserted?
I'm really confused about how we can answer the question and the justification behind the solution.
If someone could give me an explanation of the solution, I would really appreciate it.
Thank you!
A heap has both the completeness and the heap property.
completeness property: All levels of the tree are full, except the last, which is filled from the left
heap property: every parent is greater in value than its children
An insert works by placing the new value in the leftmost free position of the bottom level (or starting a new level if the bottom one is full), so we don't violate the completeness property.
Now we have to check if the value we just added violates the heap property (i.e. its parent is smaller than it). If so, we swap them. We repeat this until the heap property is no longer violated or we have reached the root.
Given your example the following inserts could have happened:
Insert(1) - we are done, no heap violation, possible
Insert(17) - we insert 17, then swap with 1, but 1 is smaller than 2, so 1 could not be a parent, not possible
Insert(20) - we insert 20, swap with 1, then swap with 17, but the first swap means 1 was a parent of 2, so not possible
Insert(30) - you get the idea
Thus the answer is only Insert(1)
Hope this helps. Also, please have a look at the Wikipedia article: https://en.wikipedia.org/wiki/Heap_(data_structure)#Implementation
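The same reasoning can be checked mechanically. The sketch below assumes the heap is stored in level order in a 0-based array and that all keys are distinct; the helper names `is_max_heap` and `possible_last_inserted` are made up for this illustration. For each key on the path from the last slot to the root, it undoes the swaps an insertion would have made and tests whether what remains was a valid max-heap.

```python
def is_max_heap(a):
    """True if every parent is >= its children (level-order array)."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


def possible_last_inserted(heap):
    """Keys that could have been inserted last into a non-empty max-heap."""
    # Indices on the path from the last slot up to the root.
    path = []
    i = len(heap) - 1
    while True:
        path.append(i)
        if i == 0:
            break
        i = (i - 1) // 2

    candidates = []
    for depth in range(len(path)):
        # Suppose the new key ended up at path[depth]. Before the insert,
        # each slot path[1..depth] held the key that now sits one step
        # lower on the path, and the last slot did not exist yet.
        before = list(heap)
        for j in range(depth):
            before[path[j + 1]] = heap[path[j]]
        before.pop()
        if is_max_heap(before):
            candidates.append(heap[path[depth]])
    return candidates


heap = [30, 25, 20, 22, 18, 17, 16, 21, 13, 15, 5, 2, 1]
print(possible_last_inserted(heap))  # [1]
```

Only keys on that path are tried, because an inserted key can only move upward from the last slot toward the root.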

Why do we build a max-heap bottom-up instead of top-down?

Question: Why do we want the loop index i in line 2 of BUILD-MAX-HEAP to decrease from ⌊length[A]/2⌋ to 1 rather than increase from 1 to ⌊length[A]/2⌋?
Algorithms: (Courtesy Introduction to Algorithms book):
I tried to show this using drawings as follows.
Approach 1: if we apply build heap the opposite way from 1 to ⌊length[A]/2⌋ of array A = <5,3,17,10,84>, we would have:
Approach 2: if we apply build heap the standard way, starting from ⌊length[A]/2⌋ and decreasing to 1, on array A = <5,3,17,10,84>, we would have:
Problem: I see that in both cases the heap property that parent is larger than its children is maintained, so I don't see why a solution says that there would be a problem such that, "we won't be allowed to call MAX-HEAPIFY, since it will fail the condition of having the subtrees be max-heaps. That is, if we start with 1, there is no guarantee that A[2] and A[3] are roots of max-heaps."
The thing is that you can only rely on MAX_HEAPIFY to do its job right, when the subtree that is rooted at i obeys the heap property everywhere except possibly for the root value (at i) itself, which may need to sift down. The job of MAX_HEAPIFY is only to move the value of the root to its right position. It cannot fix any other violations of the heap property. If however, it is guaranteed that the rest of the tree below i is obeying the heap property, then you can be sure that the subtree at i will be a heap after MAX_HEAPIFY has run.
So if you started with the top node, then who knows what you would get... the rest of the tree is not expected to obey the heap property, so MAX_HEAPIFY will not (necessarily) deliver a heap. And it doesn't help to continue the work in a top-down fashion.
If we take the example tree and perform the forward loop alternative, then we start with a call of MAX_HEAPIFY(1) on this tree:
       5
      / \
     3   17
    / \
  10   84
...then 5 would swap with 17 (at position 3), and then we would call MAX_HEAPIFY(3) recursively on that node, which would do nothing. So we would get:
      17
      / \
     3   5
    / \
  10   84
Next, we call MAX_HEAPIFY(2) which will swap 3 with 84:
      17
      / \
    84   5
    / \
  10   3
Again a recursive call follows on the node with value 3, but that will not do anything.
This was the last call of MAX_HEAPIFY in the forward loop alternative... and we can see that the value 84 had no chance to find its way all the way to the top where it belongs.
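The two loop directions can be compared directly on A = <5,3,17,10,84>. This is a minimal sketch using 0-based array indices (the book's pseudocode is 1-based, so index i here corresponds to i+1 there); the `bottom_up` flag and the function name `build` are made up for this illustration.

```python
def max_heapify(a, i):
    """Sift a[i] down, assuming both subtrees of i are already max-heaps."""
    n = len(a)
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < n and a[left] > a[largest]:
        largest = left
    if right < n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest)


def build(a, bottom_up=True):
    """Call max_heapify on every internal node, in the chosen direction."""
    a = list(a)
    internal = range(len(a) // 2)  # indices of nodes that have children
    for i in (reversed(internal) if bottom_up else internal):
        max_heapify(a, i)
    return a


A = [5, 3, 17, 10, 84]
print(build(A, bottom_up=False))  # [17, 84, 5, 10, 3]  (84 is stuck below 17)
print(build(A, bottom_up=True))   # [84, 10, 17, 5, 3]  (a valid max-heap)
```

The forward result is the final tree from the walkthrough above: 84 sits under 17, so the heap property is violated at the root.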

Given a list of keys, how do we find the almost complete binary search tree of that list?

I saw an answer here with the idea implemented in Python (not very familiar with Python) - I was looking for a more general algorithm.
EDIT:
For clarification:
Say we are given a list of integer keys: 23 44 88 12 74 32 7 39 10
That list was chosen arbitrarily. We are to create an almost complete (or complete) binary search tree from that list. There is supposed to be only one such tree...how do we find it?
A binary search tree is constructed so that all items on a node's left subtree are less than the node, and all nodes on the right subtree are greater than the node.
A complete (or almost complete) binary tree is one in which all levels except possibly the last are completely full, and the bottom level is filled to the left.
So, for example, this is an almost-complete binary search tree:
       4
      / \
     2   5
    / \
   1   3
This is not:
        3
      /   \
     2     4
    /       \
   1         5
Because the bottom level of the tree is not filled from the left.
If the number of items is one less than a power of two (i.e. 3, 7, 15, etc.), then building the tree is easy. Start by sorting the list. Then, take the middle element as the root. So if you have [1,2,3,4,5,6,7], the root node is 4.
You do the same thing recursively for the right and left halves of the array.
If the number of items is not one less than a power of two, you have to adjust the starting point (the root node) so that the bottom row is left-filled. Note that you might have to apply that adjustment recursively, as well, whenever your subtree length is not one less than a power of two.
Since this is a homework assignment, I'll leave that for you to figure out.
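For checking a hand-worked answer, here is one possible sketch of the adjustment, assuming distinct keys. The helper names `left_subtree_size` and `complete_bst` are made up for this illustration, and the tree is returned as a level-order list (index 0 is the root, and the children of index i are at 2i+1 and 2i+2). The idea is to compute how many nodes must go into the left subtree so that the bottom level is filled from the left, and use that count as the root's position in the sorted list.

```python
def left_subtree_size(n):
    """Number of nodes in the left subtree of a complete tree with n nodes."""
    if n <= 1:
        return 0
    h = n.bit_length() - 1          # height of the tree: floor(log2(n))
    last = n - (2 ** h - 1)         # nodes on the (possibly partial) bottom level
    half = 2 ** (h - 1)             # bottom-level capacity of the left subtree
    return (half - 1) + min(half, last)


def complete_bst(keys):
    """Return the almost complete BST of keys as a level-order list."""
    s = sorted(keys)
    out = [None] * len(s)

    def fill(lo, hi, idx):
        # Place the keys s[lo:hi] into the subtree rooted at array index idx.
        n = hi - lo
        if n == 0:
            return
        left = left_subtree_size(n)
        out[idx] = s[lo + left]
        fill(lo, lo + left, 2 * idx + 1)
        fill(lo + left + 1, hi, 2 * idx + 2)

    fill(0, len(s), 0)
    return out


print(complete_bst([1, 2, 3, 4, 5]))
# [4, 2, 5, 1, 3]
print(complete_bst([23, 44, 88, 12, 74, 32, 7, 39, 10]))
# [39, 23, 74, 10, 32, 44, 88, 7, 12]
```

The first call reproduces the five-node example tree shown earlier (root 4, children 2 and 5, with 1 and 3 under 2).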

Cache-aware tree implementation

I have a tree where every node may have 0 to N children.
Use-case is the following query: Given pointers to two nodes: Are these nodes within the same branch of the tree?
Examples
q(2,7) => true
q(5,4) => false
By the book (slow)
The straightforward implementation would be to store a pointer to the parent and a pointer to a list of children at each node. But this would lead to bad performance because the tree would be fragmented in memory and therefore not cache-aware.
Question
What would be a good way to represent the tree in compact form? The whole tree has about 100,000 nodes. So it should be possible to find a way to make it fit completely in the CPU-cache.
Binary trees, for example, are often represented implicitly as an array and are therefore well suited to being stored completely in the CPU-cache (if small enough).
You can pre-allocate a contiguous block of memory where you concatenate the information for all nodes.
Afterwards, each node would only need a way to retrieve the beginning of its information, and the length of that information.
In this case, the information for each node could be represented by the parent, followed by the list of children (let's assume that we use -1 when there is no parent, i.e. for the root).
For example, for the tree posted in the question, the information for node 1 would be: -1 2 3 4, the information for node 2 is: 1 5, and so on.
The contiguous array would be obtained by concatenating these arrays, resulting in something like:
-1 2 3 4 1 5 1 9 10 1 11 12 13 14 2 3 5 5 5 3 3 4 4 4 15 4
Each node would use some metadata to allow retrieving its associated information. As mentioned, this metadata would need to consist of a startIndex and length. E.g. for node 3, we would have startIndex = 6, length = 3, which allows retrieving the 1 9 10 subarray, indicating that the parent is node 1, and its children are nodes 9 and 10.
In addition, the metadata information can also be stored in the contiguous memory block, at the beginning. The metadata has fixed length for each node (two values), thus we can easily obtain the position of the metadata for a certain node, given its index.
In this way, all the information about the graph will be stored in a contiguous, cache-friendly, memory block.
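The layout described above can be sketched as follows. Everything named here is an assumption for illustration: the helper functions are made up, nodes are assumed to be numbered 1..n, and the example tree is a hypothetical stand-in (node 5 having children 6, 7, 8 is a guess, since the original figure is not available). The block stores two metadata values per node (startIndex, length) at the front, followed by the concatenated records of parent plus children.

```python
from array import array


def build_flat_tree(children, n):
    """Pack a tree with nodes 1..n into one contiguous integer array."""
    parent = {c: p for p, kids in children.items() for c in kids}
    meta, data = [], []
    for node in range(1, n + 1):
        record = [parent.get(node, -1)] + children.get(node, [])
        meta += [len(data), len(record)]  # startIndex is relative to the data part
        data += record
    return array("i", meta + data)


def parent_of(flat, n, node):
    start = flat[2 * (node - 1)]
    return flat[2 * n + start]  # the first entry of a record is the parent


def children_of(flat, n, node):
    start, length = flat[2 * (node - 1)], flat[2 * (node - 1) + 1]
    base = 2 * n + start
    return list(flat[base + 1:base + length])


def is_ancestor(flat, n, a, b):
    """True if a lies on the path from b up to the root (or a == b)."""
    while b != -1:
        if b == a:
            return True
        b = parent_of(flat, n, b)
    return False


def same_branch(flat, n, a, b):
    return is_ancestor(flat, n, a, b) or is_ancestor(flat, n, b, a)


# Hypothetical example tree with 14 nodes.
children = {1: [2, 3, 4], 2: [5], 3: [9, 10], 4: [11, 12, 13, 14], 5: [6, 7, 8]}
flat = build_flat_tree(children, 14)
print(same_branch(flat, 14, 2, 7))  # True
print(same_branch(flat, 14, 5, 4))  # False
```

With this stand-in tree, node 3 gets startIndex = 6 and length = 3, as in the description. The query walks parent links, so its cost is proportional to the depth of the deeper node; whether that is fast enough depends on the tree's shape.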

Which item goes up while inserting in a full node in a B-tree of order 5 and why?

I'm trying to learn designing a btree.
Here are the values to develop a btree of order 5.
1,12,8,2,25,6,14,28,17,7,52,16,48,68,3,26,29,53,55,45,67.
When I insert 25, it breaks into child nodes
        8
      /   \
   1 2     12 25
May I know on what basis 8 comes up as the parent? Why not any other number? What if the order of the B-tree were 4?
In a B-tree of order 5 each node (except the root) must have 2 to 4 values in it.
At the point you enter 25, the node has the values 1,2,8,12. In order to have at least 2 values in each new child (1,2) and (12,25) you have to split at 8.
Before inserting 25, the node state is like this:
Full node: 1, 2, 8, 12. Items in node are always sorted in B-tree.
When you insert the new item 25, the sequence becomes: 1, 2, 8, 12, 25.
In this sequence the middle item is the one that is promoted up.
A node split divides the data items equally: half go to the newly created node, half remain in the old one, and the middle one goes up. This is the reason why 8 goes upward.
The following figures contain a B-tree of order 5 and should help understand this situation better although the data inserted is different. In the sequence on right-side, the arrow indicates the item to be promoted upward.
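The split of a single full node can be sketched as below. This is only an illustration of the "promote the middle key" rule, not a full B-tree: the function name `insert_and_split` is made up, and "order" is assumed to mean the maximum number of children, so a node holds at most order - 1 keys (consistent with the 2 to 4 values stated above for order 5).

```python
import bisect


def insert_and_split(keys, new_key, order):
    """Insert new_key into a sorted node; split if the node overflows.

    Returns (left_keys, promoted_key, right_keys). If no split is needed,
    promoted_key is None and right_keys is empty.
    """
    keys = list(keys)
    bisect.insort(keys, new_key)
    if len(keys) <= order - 1:
        return keys, None, []
    # With an odd number of keys the middle is unambiguous. With an even
    # number, this picks the upper of the two middle keys, which is only
    # one possible convention (assumption).
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


print(insert_and_split([1, 2, 8, 12], 25, order=5))
# ([1, 2], 8, [12, 25])
print(insert_and_split([1, 8, 12], 2, order=4))
# ([1, 2], 8, [12])
```

For order 4 the overflowing node has four keys, so there is no single middle key. Promoting 8 (as here) or 2 would both leave each child with at least one key; which one is chosen depends on the convention of the textbook or implementation.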
