B+ tree deletion - data-structures

So I have this B+ tree:
I have to delete 49 here. How do I go about it?
Will it be like this :
Or like this? :

After deleting 49, merge 48 into its neighbour node, the one containing 32 and 40. We end up with the leaf node (32, 40, 48).
After merging, update its parent node from (45) to (32).
Since the node (32) only has 1 pointer, it can be eliminated.
Hence, we end up with root node (32)--->(32,40,48) on the right sub-tree of the root.
In-depth explanation:
There are 3 cases for deleting a key. After deleting a key from a node:
1) If the node still contains at least floor((n+1)/2) keys: simply delete the key. No updates are needed.
2) If the node contains fewer than floor((n+1)/2) keys and a neighbouring leaf node has a key to spare (i.e. giving out a key still leaves that neighbour with at least floor((n+1)/2) keys): borrow a key from the neighbour.
3) If the node contains fewer than floor((n+1)/2) keys and the neighbours are unable to lend a key: merge the current node with a neighbour node.
** For cases 2 and 3, always remember to update the parent nodes after borrowing or merging.
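The three cases can be sketched in Python as below. This is only an illustration: it assumes n is the maximum number of keys a node can hold, it takes n = 3 for the tree in the question (the figure is not reproduced here, so that value is a guess), and the function names are made up for this example. It only decides which case applies; it does not update the parent nodes.

```python
def min_leaf_keys(n):
    """Minimum number of keys a non-root leaf must keep (assumed rule: floor((n+1)/2))."""
    return (n + 1) // 2


def deletion_case(leaf, neighbours, key, n):
    """Return 'delete', 'borrow' or 'merge' for removing `key` from `leaf`.

    `leaf` and every entry of `neighbours` are lists of keys;
    `n` is assumed to be the maximum number of keys per node.
    """
    remaining = [k for k in leaf if k != key]
    minimum = min_leaf_keys(n)
    if len(remaining) >= minimum:
        return "delete"   # case 1: the leaf is still full enough
    if any(len(nb) - 1 >= minimum for nb in neighbours):
        return "borrow"   # case 2: some neighbour can spare a key
    return "merge"        # case 3: no neighbour can lend a key


# Leaf (48, 49) with neighbour (32, 40), as in the answer above; n = 3 is assumed.
print(deletion_case([48, 49], [[32, 40]], 49, n=3))  # merge
```

With these assumptions the leaf is left with one key, the neighbour cannot spare any, and the result is the merge described at the top of this answer.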

Related

How many keys are there in leaf node of B+ trees?

I am going through B+ trees. What is the minimum number of keys in a B+ tree leaf node?
I am reading the following references, but they say different things:
1). https://en.wikipedia.org/wiki/B%2B_tree
2). http://www.cburch.com/cs/340/reading/btree/
3). http://courses.cs.washington.edu/courses/cse326/08sp/lectures/11-b-trees.pdf
The last link uses some letter L for leaf node.
Can someone clarify what the exact count of keys in a leaf node must be?
Min: 1, because a tree with just one entry must be possible.
Max: <branch factor> - 1, because that's how the branch factor is defined.

Deleting nodes from a binary search tree

I understand the idea when deleting a node that has two subtrees: I "erase" the node's value and replace it with either its predecessor from the left subtree's value or its successor from the right subtree's value, and then delete that node.
However, does it matter if I choose the successor from the right subtree or the predecessor from the left subtree? Or is either way valid as long as I still have a binary search tree after performing the deletion?
Both ways to perform a delete operation are valid if the node has two children.
Remember that when you get either the in-order predecessor node or the in-order successor node, you must call the delete operation on that node.
It doesn't matter which one you choose to replace. In fact you may need both.
Look at the following BST.
        7
       / \
      4   10
     / \  /
    1   5 8
     \
      3
To delete 1, you need to replace 1 with right node 3.
And to delete 10, you need to replace 10 with left node 8.
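A minimal Python sketch of this, using the tree above. The `Node` class, the `delete` function and its `use_successor` flag are names invented for this example; the two-children case copies the replacement key into the node and then deletes the replacement from the subtree it came from.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def delete(node, key, use_successor=True):
    """Delete `key` from the subtree rooted at `node`; return the new subtree root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key, use_successor)
    elif key > node.key:
        node.right = delete(node.right, key, use_successor)
    elif node.left is None:
        return node.right            # zero children or only a right child
    elif node.right is None:
        return node.left             # only a left child
    elif use_successor:
        repl = node.right            # in-order successor: leftmost of right subtree
        while repl.left is not None:
            repl = repl.left
        node.key = repl.key
        node.right = delete(node.right, repl.key, use_successor)
    else:
        repl = node.left             # in-order predecessor: rightmost of left subtree
        while repl.right is not None:
            repl = repl.right
        node.key = repl.key
        node.left = delete(node.left, repl.key, use_successor)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def build():
    # The tree drawn above: 7 / (4, 10) / (1, 5, 8) / 3
    return Node(7, Node(4, Node(1, None, Node(3)), Node(5)), Node(10, Node(8)))


print(delete(build(), 1).left.left.key)                     # 3 takes the place of 1
print(delete(build(), 10).right.key)                        # 8 takes the place of 10
print(inorder(delete(build(), 4, use_successor=True)))      # [1, 3, 5, 7, 8, 10]
print(inorder(delete(build(), 4, use_successor=False)))     # [1, 3, 5, 7, 8, 10]
```

Deleting 4 (which has two children) gives a different shape depending on the flag, but the in-order sequence is the same either way.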

Binary Search Trees-Deletion

In a Binary Search Tree, when deleting a node that has two children, why can't we simply use the predecessor instead of the successor as the replacement?
We'd like to delete such a node with minimum amount of work and disruption to the structure of the tree.
Suppose we want to delete the node containing 6 from the following tree:
The standard solution is based on this idea: we leave the node containing 6 exactly where it is, but we get rid of the value 6 and find another value to store in the 6 node. This value is taken from a node below the 6 node, and it is that node that is actually removed from the tree.
Now, what value can we move into the vacated node and have a binary search tree? Well, here's how to figure it out. If we choose value X, then:
(1) everything in the left subtree must be smaller than X.
(2) everything in the right subtree must be bigger than X.
Let's suppose we're going to get X from the left subtree. (2) is guaranteed because everything in the left subtree is smaller than everything in the right subtree. What about (1)? If X is coming from the left subtree, (1) says that there is a unique choice for X - we must choose X to be the largest value in the left subtree. In our example, 3 is the largest value in the left subtree. So if we put 3 in the vacated node and delete it from its current position we will have a BST with 6 deleted.
The result is :
"why cannot we simply put the predecessor in place of the successor of a node in deletion case where a node is having two children?"
Either one can be used; it is not necessary to replace the deleted node with the inorder successor. In both cases the general contract of a BST is maintained.
Case 1. Replace the deleted node with the inorder successor.
This is done by finding the leftmost node in the deleted node's right subtree.
Case 2. Replace the deleted node with the inorder predecessor.
This is done by finding the rightmost node in the deleted node's left subtree.
Note that both these cases will keep all the elements in the left subtree smaller and all elements in right subtree greater than the element that we have brought into the position of the deleted node.
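As a rough check of that claim, here is a small Python sketch. The subtrees are a made-up example (tuples of the form `(key, left, right)`), and `leftmost`, `rightmost`, `keys` and `valid_replacement` are names chosen for illustration. It finds both candidates and verifies that each one is larger than everything remaining on the left and smaller than everything remaining on the right.

```python
def leftmost(t):
    """Smallest key in a non-empty subtree t = (key, left, right)."""
    while t[1] is not None:
        t = t[1]
    return t[0]


def rightmost(t):
    """Largest key in a non-empty subtree t = (key, left, right)."""
    while t[2] is not None:
        t = t[2]
    return t[0]


def keys(t):
    """All keys of a subtree, in order."""
    return [] if t is None else keys(t[1]) + [t[0]] + keys(t[2])


def valid_replacement(x, left, right):
    """True if x can sit above what remains of `left` and `right` once x is removed."""
    return (all(k < x for k in keys(left) if k != x)
            and all(k > x for k in keys(right) if k != x))


# Hypothetical subtrees of a node that is being deleted.
left = (2, (1, None, None), (3, None, None))
right = (8, (7, None, None), (9, None, None))

pred = rightmost(left)    # inorder predecessor
succ = leftmost(right)    # inorder successor
print(pred, valid_replacement(pred, left, right))   # 3 True
print(succ, valid_replacement(succ, left, right))   # 7 True
print(valid_replacement(2, left, right))            # False: 3 would stay on the left of 2
```

The last line shows why the choice within a subtree is forced: any other key from the left subtree leaves a larger key behind on the left.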

Deletion procedure for a Binary Search Tree

Consider the deletion procedure on a BST when the node to delete has two children. Let's say I always replace it with the node holding the minimum key in its right subtree.
The question is: is this procedure commutative? That is, does deleting x and then y give the same result as deleting y first and then x?
I think the answer is no, but I can't find a counterexample, nor figure out any valid reasoning.
EDIT:
Maybe I need to be clearer.
Consider the transplant(node x, node y) procedure: it replaces x with y (and its subtree).
So, if I want to delete a node (say x) which has two children, I replace it with the node holding the minimum key in its right subtree:
y = minimum(x.right)
transplant(y, y.right) // extracts the minimum (it doesn't have left child)
y.right = x.right
y.left = x.left
transplant(x,y)
The question was how to prove the procedure above is not commutative.
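For reference, a runnable Python sketch of the procedure above. It assumes every node keeps a parent pointer, which the pseudocode leaves implicit; the `Node` and `Tree` classes and the method name `delete_two_children` are invented for this example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self, keys=()):
        self.root = None
        for key in keys:
            self.insert(key)

    def insert(self, key):
        node, parent = self.root, None
        while node is not None:
            parent, node = node, (node.left if key < node.key else node.right)
        new = Node(key)
        new.parent = parent
        if parent is None:
            self.root = new
        elif key < parent.key:
            parent.left = new
        else:
            parent.right = new
        return new

    def transplant(self, x, y):
        """Replace the subtree rooted at x with the subtree rooted at y (y may be None)."""
        if x.parent is None:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        if y is not None:
            y.parent = x.parent

    def delete_two_children(self, x):
        """Delete x, which is assumed to have two children."""
        y = x.right
        while y.left is not None:        # y = minimum(x.right)
            y = y.left
        self.transplant(y, y.right)      # extract y (it has no left child)
        y.right = x.right
        if y.right is not None:
            y.right.parent = y
        y.left = x.left
        y.left.parent = y
        self.transplant(x, y)


tree = Tree([4, 3, 7, 6])
tree.delete_two_children(tree.root)
print(tree.root.key, tree.root.left.key, tree.root.right.key)  # 6 3 7
```

The only additions to the pseudocode are the parent-pointer updates after re-linking y's children.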
Deletion (in general) is not commutative. Here is a counterexample:
    4
   / \
  3   7
     /
    6
What if we delete 4 and then 3?
When we delete 4, we get 6 as the new root:
    6
   / \
  3   7
Deleting 3, which is now a leaf, simply removes it and gives us this:
  6
   \
    7
What if we delete 3 and then 4?
When we delete 3, the rest of the tree doesn't change:
  4
   \
    7
   /
  6
However, when we now delete 4, which has only one child, the new root becomes 7:
    7
   /
  6
The two resulting trees are not the same, therefore deletion is not commutative.
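The counterexample can be replayed with a short Python sketch. This assumes the usual deletion rules: a node with at most one child is replaced by that child, and a node with two children takes the minimum key of its right subtree. `Node`, `delete`, `shape` and `build` are names made up for this example, and the code copies keys instead of re-linking nodes, which gives the same tree shape.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def delete(root, key):
    """Standard BST deletion; returns the new root of the subtree."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:
        return root.right                # zero children or only a right child
    elif root.right is None:
        return root.left                 # only a left child
    else:
        succ = root.right                # two children: minimum of the right subtree
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    return root


def shape(node):
    """Nested-tuple view (key, left, right) so that two trees can be compared."""
    return None if node is None else (node.key, shape(node.left), shape(node.right))


def build():
    return Node(4, Node(3), Node(7, Node(6)))


four_then_three = shape(delete(delete(build(), 4), 3))
three_then_four = shape(delete(delete(build(), 3), 4))
print(four_then_three)   # (6, None, (7, None, None))
print(three_then_four)   # (7, (6, None, None), None)
```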
UPDATE
I didn't read the restriction that this is when you always delete a node with 2 children. My solution is for the general case. I'll update it if/when I can find a counter-example.
ANOTHER UPDATE
I don't have concrete proof, but I'm going to hazard a guess:
In the general case, you handle deletions differently based on whether you have two children, one child, or no children. In the counter-example I provided, I first delete a node with two children and then a node with one child. After that, I delete a node with no children and then another node with one child.
In the special case of only deleting nodes with two children, you want to consider the case where both nodes are in the same sub-tree (since it wouldn't matter if they are in different sub-trees; you can be sure that the overall structure won't change based on the order of deletion). What you really need to prove is whether the order of deletion of nodes in the same sub-tree, where each node has two children, matters.
Consider two nodes A and B where A is an ancestor of B. Then you can further refine the question to be:
Is deletion commutative when you are considering the deletion of two nodes from a Binary Search Tree which have an ancestor-descendant relationship to each other (this would imply that they are in the same sub-tree)?
When you delete a node (let's say A), you traverse the right sub-tree to find the minimum element. This node will be a leaf node and can never be equal to B (because B has two children and cannot be a leaf node). You would then replace the value of A with the value of this leaf-node. What this means is that the only structural change to the tree is the replacement of A's value with the value of the leaf-node, and the loss of the leaf-node.
The same process is involved for B. That is, you replace the value of the node and remove a leaf-node. So in general, when you delete a node with two children, the only structural change is the change in value of the node you are deleting, and the deletion of the leaf node whose value you are using as the replacement.
So the question is further refined:
Can you guarantee that you will always get the same replacement node regardless of the order of deletion (when you are always deleting a node with two children)?
The answer (I think) is yes. Why? Here are a few observations:
Let's say you delete the descendant node first and the ancestor node second. The sub-tree that was modified when you deleted the descendant node is not in the left sub-tree of the ancestor node's right child. This means that this sub-tree remains unaffected. What this also means is regardless of the order of deletion, two different sub-trees are modified and therefore the operation is commutative.
Again, let's say you delete the descendant node first and the ancestor node second. The sub-tree that was modified when you deleted the descendant node is in the left sub-tree of the ancestor node's right child. But even here, there is no overlap. The reason is when you delete the descendant node first, you look at the left sub-tree of the descendant node's right child. When you then delete the ancestor node, you will never go down that sub-tree since you will always be going towards the left after you enter the ancestor node's right-child's left sub-tree. So again, regardless of what you delete first you are modifying different sub-trees and so it appears order doesn't matter.
Another case is if you delete the ancestor node first and you find that the minimum node is a child of the descendant node. This means that the descendant node will end up with one child, and deleting the one child is trivial. Now consider the case where in this scenario, you deleted the descendant node first. Then you would replace the value of the descendant node with its right child and then delete the right child. Then when you delete the ancestor node, you end up finding the same minimum node (the old deleted node's left child, which is also the replaced node's left child). Either way, you end up with the same structure.
This is not a rigorous proof; these are just some observations I've made. By all means, feel free to poke holes!
It seems to me that the counterexample shown in Vivin's answer is the sole case of non-commutativity, and that it is indeed eliminated by the restriction that only nodes with two children can be deleted.
But it can also be eliminated if we discard what appears to be one of Vivin's premises, which is that we should traverse the right subtree as little as possible to find any acceptable successor. If, instead, we always promote the smallest node in the right subtree as the successor, regardless of how far away it turns out to be located, then even if we relax the restriction on deleting nodes with fewer than two children, Vivin's result
    7
   /
  6
is never reached if we start at
    4
   / \
  3   7
     /
    6
Instead, we would first delete 3 (without successor) and then delete 4 (with 6 as successor), yielding
  6
   \
    7
which is the same as if the order of deletion were reversed.
Deletion would then be commutative, and I think it is always commutative, given the premise I have named (successor is always smallest node in right subtree of deleted node).
I do not have a formal proof to offer, merely an enumeration of cases:
If the two nodes to be deleted are in different subtrees, then deletion of one does not affect the other. Only when they are in the same path can the order of deletion possibly affect the outcome.
So any effect on commutativity can come only when an ancestor node and one of its descendants are both deleted. Now, how does their vertical relationship affect commutativity?
Descendant in the left subtree of the ancestor. This situation will not affect commutativity because the successor comes from the right subtree and cannot affect the left subtree at all.
Descendant in the right subtree of the ancestor. If the ancestor's successor is always the smallest node in the right subtree, then the order of deletion cannot change the choice of successor, no matter what descendant is deleted before or after the ancestor. Even if the successor to the ancestor turns out to be the descendant node that is also to be deleted, that descendant too is replaced with the next-largest node after it, and that descendant cannot have its own left subtree remaining to be dealt with. So deletion of an ancestor and any right-subtree descendant will always be commutative.
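Here is a small Python sketch of the premise I have named, run on the tree from Vivin's counterexample. The names `Node`, `delete_promote_min` and `shape` are invented for this example. The only change from the usual algorithm is that the smallest node of the right subtree is promoted whenever a right subtree exists, even if the deleted node has no left child. It checks this one example only and is not a proof.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def delete_promote_min(root, key):
    """Delete `key`, always promoting the minimum of the right subtree when there is one."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete_promote_min(root.left, key)
    elif key > root.key:
        root.right = delete_promote_min(root.right, key)
    elif root.right is None:
        return root.left                 # no right subtree, so no successor to promote
    else:
        succ = root.right                # smallest node of the right subtree
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key
        root.right = delete_promote_min(root.right, succ.key)
    return root


def shape(node):
    return None if node is None else (node.key, shape(node.left), shape(node.right))


def build():
    return Node(4, Node(3), Node(7, Node(6)))


three_then_four = shape(delete_promote_min(delete_promote_min(build(), 3), 4))
four_then_three = shape(delete_promote_min(delete_promote_min(build(), 4), 3))
print(three_then_four)   # (6, None, (7, None, None))
print(four_then_three)   # (6, None, (7, None, None))
```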
I think there are two equally viable ways to delete a node, when it has 2 children: SKIP TO CASE 4...
Case 1: delete 3 (leaf node)
    2             2
   / \    -->    /
  1   3         1
Case 2: delete 2 (node with only a left child)
    2
   /      -->    1
  1
Case 3: delete 2 (node with only a right child)
  2
   \      -->    3
    3
______________________________________________________________________
Case 4: delete 2 (node with both left and right children)
    2             1           3
   / \    -->      \    or   /
  1   3             3       1
BOTH WORK and have different resulting trees :)
______________________________________________________________________
As the algorithm is explained here: http://www.mathcs.emory.edu/~cheung/Courses/323/Syllabus/Trees/AVL-delete.html
Deleting a node with 2 children nodes:
1) Replace the (to-delete) node with its in-order predecessor or in-order successor
2) Then delete the in-order predecessor or in-order successor
I respond here to Vivin's second update.
I think this is a good recast of the question:
"Is deletion commutative when you are considering the deletion of two nodes from a Binary Search Tree which have an ancestor-descendant relationship to each other (this would imply that they are in the same sub-tree)?"
but the second sentence of the passage below is not true:
"When you delete a node (let's say A), you traverse the right sub-tree to find the minimum element. This node will be a leaf node and can never be equal to B"
since the minimum element in A's right subtree can have a right child. So, it is not a leaf.
Let's call the minimum element in A's right subtree successor(A).
Now, it is true that B cannot be successor(A), but it can be in its right subtree. So, it is a mess.
I try to summarize.
Hypothesis:
A and B have two children each.
A and B are in the same subtree.
Other stuff we can deduce from hypothesis:
B is not successor(A), nor is A successor(B).
Now, given that, I think there are 4 different cases (as usual, let A be an ancestor of B):
1) B is in A's left subtree
2) B is an ancestor of successor(A)
3) successor(A) is an ancestor of B
4) B and successor(A) don't have any relationship (they are in different subtrees of A)
I think (but of course I cannot prove it) that cases 1, 2 and 4 don't matter.
So, only in case 3, where successor(A) is an ancestor of B, could the deletion procedure fail to be commutative. Or could it?
I pass the ball : )
Regards.

In B-trees which element gets promoted when the node splits

Let's say there is a B-tree of order 8. This means it can have 8 pointers and 7 elements. Say the letters A through G are stored in this B-tree. So this B-tree is just a single node containing 7 elements.
Then you try to insert J into the tree. There's no room, so you have to split the node and create a new root node. Which element gets promoted up into the root node?
When you want to insert a new element into a full node (with 2*t - 1 keys):
you split it by choosing the median key of the node (the key in the middle)
you generate two new children with t-1 keys each (the keys on either side of the median)
the median key goes into the parent node (a newly created root, if the node being split was the root)
then you proceed with the normal insertion algorithm, looking for where you should place the new element.
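For the example in the question (order 8, so t = 4 and a full node has 7 keys), the steps can be sketched in Python like this. The function name `split_full_node` is made up for illustration, and the sketch works on plain key lists without child pointers.

```python
import bisect


def split_full_node(keys):
    """Split a full node holding 2*t - 1 sorted keys into (left, median, right)."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# The full node A..G from the question.
left, median, right = split_full_node(list("ABCDEFG"))

# Normal insertion then continues in whichever child should hold the new key.
target = left if "J" < median else right
bisect.insort(target, "J")

print(median)   # D
print(left)     # ['A', 'B', 'C']
print(right)    # ['E', 'F', 'G', 'J']
```

So under this scheme D is the element promoted into the new root, and J ends up in the right child.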
