I have come across this question and haven't been able to answer it.
Given a B-tree of order 9 with 4 levels, will inserting a new item x and then immediately removing it always restore the tree to its original structure?
Will removing an existing item x and then re-inserting it always restore the tree to its original structure?
Prove it.
So far I have tried to disprove it but haven't been able to.
I honestly can't find the answer. I am not asking for a full proof; a general idea of how to prove it would satisfy me.
The answer obviously depends on the implementation of the insert and delete methods but in short: no.
I won't give you a full proof (because you didn't ask for it and because I'm too lazy) but the general idea should be that usually when you delete a node the inner-most node of the opposite side (relative to the parent) takes its place. So in any scenario where that node exists, it will be moved up. It also means that the node was not a leaf, which is a problem because insertion usually puts new nodes on the tree as a leaf. So the original structure will only be maintained if the inner-most node of the opposite side (relative to the parent) is empty.
This is the deletion I'm referring to. If you remove 2 and re-insert it, you get a counterexample.
Well, I'm studying for a test and I'm a little bit confused with the following.
The following image is a B-tree with t=3 so each node can have at most 2t-1 keys and at least t-1 keys.
I'm being asked to delete key=3.
I can't understand why I need to join the root with its sons in this case. I know the delete algorithm is defensive as it starts in the root and checks every node so it will not need to go to any ancestor again.
But which rule will be broken if I don't join the root with its son?
Original B-tree
After deleting key 3
As for me I would just delete key 3 and that's it.
It would not break any of the rules; the algorithm simply performs every possible node merge while descending to the given key. This is necessary to ensure that there will be no need to traverse the tree upwards after the deletion.
Also, the height of the tree is reduced, which will speed up later lookups.
So this behaviour is an algorithmic decision to implement the B-tree efficiently.
Consider the following 2-3-4 tree (i.e., B-tree with a minimum degree of two) in
which each data item is a letter. The usual alphabetical ordering of letters is used
in constructing the tree.
What is the result of inserting G in the above tree?
I am getting the answer as
But the answer in solution key is
Can anyone explain how to get the answer provided by the solution key?
As long as the invariants are not violated, the operation is technically valid. The insertion algorithm in CLRS splits on the way down, so it would split the root like you did.
However, another implementation might observe that the second child is empty and the first is full. That means a "rotation" can be done and the root's key count is unaffected. The rotation involves pushing L down into the second child (prepending it) and pulling I up into L's previous place in the root. Now the first child has only two entries and you can insert into it.
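To make the rotation concrete, here is a minimal sketch in Python. The tree below is hypothetical (the letters are made up and are not the tree from the question's figure), nodes are plain sorted lists of keys, and both children are assumed to be leaves so no child pointers need to move. `rotate_right` and `MAX_KEYS` are names chosen for this illustration only.

```python
import bisect

MAX_KEYS = 3  # a 2-3-4 tree node holds at most three keys


def rotate_right(root_keys, sep_index, left_keys, right_keys):
    """Shift one key from a full left leaf to its right sibling via the parent.

    The separator at root_keys[sep_index] moves down to the front of the
    right sibling, and the largest key of the left leaf takes its place.
    """
    old_separator = root_keys[sep_index]
    root_keys[sep_index] = left_keys.pop()   # pull the largest left key up
    right_keys.insert(0, old_separator)      # prepend the old separator


# Hypothetical tree: root [L, R] with leaf children [A, E, I], [N], [T]
root = ["L", "R"]
first = ["A", "E", "I"]   # full
second = ["N"]            # has room
third = ["T"]

# Insert G: it belongs in the first leaf, which is full, so rotate first.
if len(first) == MAX_KEYS and len(second) < MAX_KEYS:
    rotate_right(root, 0, first, second)
bisect.insort(first, "G")  # G is smaller than the new separator I

print(root, first, second, third)
```

The root still has two keys afterwards, so no split was needed. A real implementation would also have to handle the case where the new key is larger than the key that was pulled up.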
Animated insertion using the CLRS method you used
The idea of deleting a node in BST is:
If the node has no children, delete it and set the parent's pointer to this node to null.
If the node has one child, replace the node with its child by updating the parent's pointer to point to that child.
If the node has two children, find the predecessor of the node and replace the node with it; also update the pointer in the predecessor's parent to point to the predecessor's only child (which can only be a left child).
The last case can also be done using the successor instead of the predecessor!
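The three cases above can be sketched in Python roughly as follows. This is only an illustration for a plain, unbalanced BST with unique keys; the `use_predecessor` flag is a name made up here to show where the predecessor/successor choice enters.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insertion; returns the (possibly new) subtree root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def delete(root, key, use_predecessor=True):
    """Delete key from the subtree; returns the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key, use_predecessor)
    elif key > root.key:
        root.right = delete(root.right, key, use_predecessor)
    elif root.left is None:       # no children, or only a right child
        return root.right
    elif root.right is None:      # only a left child
        return root.left
    elif use_predecessor:         # two children: copy the predecessor up
        pred = root.left
        while pred.right is not None:
            pred = pred.right
        root.key = pred.key
        root.left = delete(root.left, pred.key, use_predecessor)
    else:                         # two children: copy the successor up
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key
        root.right = delete(root.right, succ.key, use_predecessor)
    return root


def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


a = delete(build([5, 3, 8, 2, 4, 7, 9]), 5, use_predecessor=True)
b = delete(build([5, 3, 8, 2, 4, 7, 9]), 5, use_predecessor=False)
print(a.key, b.key)  # 4 7
```

Both variants keep the in-order sequence intact; they differ only in which subtree loses a node.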
It's said that if we use the predecessor in some cases and the successor in others (giving them equal priority), we can get better empirical performance.
Now the question is: how is it done? Based on what strategy? And how does it affect performance? (I guess by performance they mean time complexity.)
What I think is that we have to choose the predecessor or the successor so as to get a more balanced tree, but I don't know how to choose which one to use.
One solution is to choose one of them at random (with equal probability), but isn't it better to base the strategy on the tree structure? The question is WHEN to choose WHICH.
The thing is that this is a fundamental problem: finding a correct removal algorithm for a BST. For 50 years people have been trying to solve it (just like in-place merge), and they haven't found anything better than the usual algorithm (removing via the predecessor or successor). So what is wrong with the classic algorithm? This kind of removal unbalances the tree. After enough random add/remove operations you'll get an unbalanced tree with height on the order of sqrt(n). And it doesn't matter which you choose, successor or predecessor (or a random choice between the two): the result is the same.
So, what to choose? My guess is that randomized (successor or predecessor) deletion will postpone the unbalancing of your tree. But if you want a tree that stays balanced, you have to use a red-black tree or something similar.
As you said, it's a question of balance, so in general the method that disturbs the balance the least is preferable. You can keep some metrics to measure the level of balance (e.g., the difference between the maximal and minimal leaf height, the average height, etc.), but I'm not sure whether the overhead is worth it. Also, there are self-balancing data structures (red-black trees, AVL trees, etc.) that mitigate this problem by rebalancing after each deletion. If you want to use the basic BST, I suppose the best strategy without a priori knowledge of the tree structure and the deletion sequence would be to toggle between the two methods for each deletion.
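For what it's worth, the balance metrics mentioned above are cheap to compute offline. A small Python sketch, assuming (purely for illustration) that a tree is written as nested `(key, left, right)` tuples with `None` for a missing child; `leaf_depths` and `balance_metrics` are made-up names:

```python
def leaf_depths(node, depth=0):
    """Yield the depth of every leaf; node is (key, left, right) or None."""
    if node is None:
        return
    _key, left, right = node
    if left is None and right is None:
        yield depth
    else:
        yield from leaf_depths(left, depth + 1)
        yield from leaf_depths(right, depth + 1)


def balance_metrics(root):
    """Return min/max leaf depth, their difference, and the average."""
    depths = list(leaf_depths(root))
    if not depths:
        return {"min": 0, "max": 0, "spread": 0, "average": 0.0}
    return {
        "min": min(depths),
        "max": max(depths),
        "spread": max(depths) - min(depths),
        "average": sum(depths) / len(depths),
    }


# Leaves: 2 (depth 2) and 8 (depth 1)
tree = (5, (3, (2, None, None), None), (8, None, None))
metrics = balance_metrics(tree)
print(metrics)
```

Recomputing this after every deletion costs a full traversal, which is the overhead in question; maintaining the values incrementally is possible but is essentially what the self-balancing trees already do.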
I'm implementing a level-order succinct trie and I want to be able to jump from a given node back to its parent.
I tried several combinations of rank/select but I can't wrap my head around this one...
I'm using this article as a base documentation :
http://stevehanov.ca/blog/index.php?id=120
It explains how to traverse children, but not how to go up.
Thanks to this MIT lecture (http://www.youtube.com/watch?v=1MVVvNRMXoU) I know this is possible (in constant time, as stated at 15:50), but the speaker only explains it for a binary trie (e.g. using the formula select1(floor(i/2))).
How can I do that on a k-ary trie?
Well, I don't know what select1() is, but the other part (floor(i/2)) looks like the trick you would use in an array-embedded binary tree, like those described here. You would divide by 2 because every parent has exactly 2 children --> every level uses twice the space of the parent level.
If you don't have the same number of children in every node (except for the leaves and perhaps one node with fewer children), you can't use this trick.
If you want to know the parent of any given node, you will need to add a pointer to the parent in every node.
Though, since trees are generally traversed starting at the root and going down, the usual thing to do is to store, in an array, the pointers to the nodes of the path. At any given point, the parent of the current node is the previous element in the array. This way you don't need to add a pointer to the parent in every node.
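A rough Python sketch of that path-array idea, assuming a simple pointer-based trie built from nested dicts (this is not the succinct representation from the question, and `descend` is a name made up for the example):

```python
def descend(root, word):
    """Walk down a dict-based trie and return the list of nodes visited.

    path[0] is the root; the parent of path[i] is path[i - 1].
    The walk stops early if a character has no matching child.
    """
    path = [root]
    for ch in word:
        child = path[-1].get(ch)
        if child is None:
            break
        path.append(child)
    return path


# Trie containing "cat" and "car"; each node maps a character to a child.
trie = {"c": {"a": {"t": {}, "r": {}}}}

path = descend(trie, "cat")
current = path[-1]   # node reached after "cat"
parent = path[-2]    # node reached after "ca"
print(sorted(parent.keys()))  # ['r', 't']
```

The list only costs memory proportional to the depth of the current node, and going up is just dropping the last element.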
I think I've found my answer. This paper of Guy Jacobson explains it in section 3.2 Level-order unary degree sequence.
parent(x){ select1(rank0(x)) }
Space-efficient Static Trees and Graphs
http://www.cs.cmu.edu/afs/cs/project/aladdin/wwwlocal/compression/00063533.pdf
This works pretty well, as long as you don't mess up your node numbering like I did.
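Here is a small Python sketch of that formula, under a few assumptions that are mine: positions are 1-based, rank is inclusive, a node is identified by the position of the 1 bit that represents it in its parent's description, and the bit string starts with a "10" super-root. The rank and select below are naive linear scans written only to check the numbering; a real succinct structure would answer them in constant time with auxiliary tables.

```python
def rank(bits, bit, i):
    """Number of occurrences of `bit` in positions 1..i (1-based, inclusive)."""
    return bits[:i].count(bit)


def select(bits, bit, k):
    """1-based position of the k-th occurrence of `bit` (k >= 1)."""
    seen = 0
    for pos, b in enumerate(bits, start=1):
        if b == bit:
            seen += 1
            if seen == k:
                return pos
    raise ValueError("fewer than k occurrences of bit")


def parent(bits, x):
    """parent(x) = select1(rank0(x)); returns None for the root."""
    zeros = rank(bits, "0", x)
    if zeros == 0:
        return None
    return select(bits, "1", zeros)


def first_child(bits, x):
    """first_child(x) = select0(rank1(x)) + 1; returns None for a leaf."""
    pos = select(bits, "0", rank(bits, "1", x)) + 1
    if pos <= len(bits) and bits[pos - 1] == "1":
        return pos
    return None


# Tree: A has children B and C; B has child D. Level order: A, B, C, D.
# Encoding: super-root "10", then A "110", B "10", C "0", D "0".
bits = "10" + "110" + "10" + "0" + "0"
A, B, C, D = 1, 3, 4, 6   # positions of the 1 bits

print(parent(bits, D), parent(bits, B), parent(bits, C), parent(bits, A))
# 3 1 1 None
```

If your implementation uses 0-based positions or an exclusive rank, the formula needs the matching +1/-1 adjustments, which is the kind of numbering mistake that is easy to make.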
I'm trying to learn about B-trees, and every source I can find seems to omit the discussion of how to remove an element from the tree while preserving the B-tree properties.
Can someone explain the algorithm or point me to a resource that explains how it's done?
There's an explanation of it on the Wikipedia page. B-tree - Deletion
If you haven't got it yet, I strongly recommend Cormen et al., Introduction to Algorithms, 3rd Edition.
It is not described because the operations naturally stem from the B-Tree properties.
Since you have a lower-bound on the number of elements in a node, if removing your elements violates this invariant, then you need to restore it, which generally involves merging with a neighbour (or stealing some of its elements).
If you merge with a neighbour, then you need to remove an element from the parent node, which triggers the same algorithm. This is applied recursively until you reach the root.
B-trees don't have rebalancing (at least not the ones I've seen), so it's far less complicated than maintaining a red-black tree or an AVL tree, which is probably why people didn't feel compelled to write about removal.
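As a rough illustration of that restore step (not a full deletion routine), here is a Python sketch. `BNode`, `fix_underflow` and `min_keys` are names invented for this example; the function assumes that child `i` of `parent` has just dropped below the minimum and that every sibling satisfies the invariant.

```python
class BNode:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list for a leaf


def fix_underflow(parent, i, min_keys):
    """Restore the minimum key count of parent.children[i].

    Returns True if a merge removed a key from the parent, in which case
    the caller has to check the parent for underflow in turn.
    """
    child = parent.children[i]
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None

    if left is not None and len(left.keys) > min_keys:
        # Steal from the left: separator moves down, left's last key moves up.
        child.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
        if left.children:
            child.children.insert(0, left.children.pop())
        return False

    if right is not None and len(right.keys) > min_keys:
        # Steal from the right: separator moves down, right's first key moves up.
        child.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        if right.children:
            child.children.append(right.children.pop(0))
        return False

    # No sibling can spare a key: merge two neighbours around their separator.
    j = i - 1 if left is not None else i
    a, b = parent.children[j], parent.children[j + 1]
    a.keys += [parent.keys.pop(j)] + b.keys
    a.children += b.children
    parent.children.pop(j + 1)
    return True


# Minimum of 2 keys per node; the middle leaf has underflowed to one key.
p1 = BNode([10, 20], [BNode([1, 2]), BNode([12]), BNode([22, 23, 24])])
merged1 = fix_underflow(p1, 1, min_keys=2)   # steals from the right sibling

p2 = BNode([10, 20], [BNode([1, 2]), BNode([12]), BNode([22, 23])])
merged2 = fix_underflow(p2, 1, min_keys=2)   # has to merge with the left
```

When the function returns True, the parent has lost a key, so the same check is repeated one level up; if the root ends up with no keys, its single remaining child becomes the new root.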
Which kind of B-tree are you talking about? With linked leaves or not? Also, there are different ways of removing an item (top-down, bottom-up, etc.). This paper might help: B-trees, Shadowing, and Clones (even though it contains a lot of file-system-specific material).
The deletion example from CLRS (2nd edition) is available here: http://ysangkok.github.io/js-clrs-btree/btree.html
Press "Init book" and then push the deletion buttons in order. That will cover all cases. Try and predict the new tree state before pushing each button, and try to recognize how the cases are all unique.