B-Tree deletion in a single pass - algorithm

Is it possible to remove an element from a B-Tree in a single pass?
Wikipedia says "Do a single pass down the tree, but before entering (visiting) a node, restructure the tree so that once the key to be deleted is encountered, it can be deleted without triggering the need for any further restructuring"
but doesn't say anything about how it is done.
Google only gives me the usual process of removing an element and then restructuring the tree afterwards.
Cormen also doesn't say anything about it.

It's possible in a variant of the B+ tree called the PO-B+ tree. In this "preparatory operations B+ tree" the number of keys in a node may be between n-1 and 2n+1, rather than between n and 2n as in the usual B+ tree (quoted from the paper). For the delete operation (called PO-delete in the paper) you just merge (called "catenate" in the paper) all the nodes (except the root) that could be merged (or take a key from a neighbor) while moving toward the leaf. For the PO-insert operation you split all the nodes (including the root). The description is given in the paper.
This preemptive restructuring only makes sense if the tree is used in a multi-threaded environment, as it reduces locking and increases concurrency. It does not pay off if the tree is accessed by only one actor.
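The same preemptive idea can be applied to a plain B-tree: before descending into a child that holds only the minimum number of keys, borrow a key from a sibling or merge with one, so that by the time the key to delete is reached no fix-up pass back toward the root is needed. Below is a rough Python sketch of just that restructuring step, assuming a minimum degree t; the node layout and names are my own, not taken from the paper.

    # A rough sketch of the "restructure before entering a node" step for a
    # plain B-tree with minimum degree t (every non-root node holds at least
    # t-1 keys).  Names and layout are illustrative, not from the paper.

    class BTreeNode:
        def __init__(self, leaf=True):
            self.keys = []        # sorted keys
            self.children = []    # internal nodes: len(children) == len(keys) + 1
            self.leaf = leaf

    def ensure_child_can_lose_a_key(parent, i, t):
        """Before descending into parent.children[i], make sure it holds at
        least t keys, by borrowing from a sibling or merging with one.  If the
        delete routine calls this at every internal node on the way down, the
        key can be removed at the bottom without any pass back up the tree."""
        child = parent.children[i]
        if len(child.keys) >= t:
            return child
        left = parent.children[i - 1] if i > 0 else None
        right = parent.children[i + 1] if i + 1 < len(parent.children) else None
        if left and len(left.keys) >= t:
            # Rotate a key in through the parent from the left sibling.
            child.keys.insert(0, parent.keys[i - 1])
            parent.keys[i - 1] = left.keys.pop()
            if not left.leaf:
                child.children.insert(0, left.children.pop())
            return child
        if right and len(right.keys) >= t:
            # Rotate a key in through the parent from the right sibling.
            child.keys.append(parent.keys[i])
            parent.keys[i] = right.keys.pop(0)
            if not right.leaf:
                child.children.append(right.children.pop(0))
            return child
        # Neither sibling can spare a key: merge ("catenate") with one of them.
        # The parent loses a key here, but it was itself topped up to >= t keys
        # before we descended into it, so nothing above needs to change.
        if left:
            left.keys.append(parent.keys.pop(i - 1))
            left.keys.extend(child.keys)
            left.children.extend(child.children)
            parent.children.pop(i)
            return left
        child.keys.append(parent.keys.pop(i))
        child.keys.extend(right.keys)
        child.children.extend(right.children)
        parent.children.pop(i + 1)
        return child

A full delete routine would call this at each internal node before recursing into the returned child, and handle the leaf and internal-key cases as usual.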

Implementing the Rope data structure using binary search trees (splay trees)

In a standard implementation of the Rope data structure using splay trees, the nodes would be ordered according to a rank statistic measuring the position of each one from the start of the string, so the keys normally found in a binary search tree would be irrelevant, would they not?
I ask because the keys shown in the graphic below (thanks Wikipedia!) are letters, which would presumably become non-unique once the number of nodes exceeded the length of the chosen alphabet. Wouldn't it be better to use integers or avoid using keys altogether?
Separately, can anyone point me to a good implementation of the logic to recompute rank statistics after each operation?
Presumably, if the index for a split falls within the substring attached to a particular node, say, between "Hel" and "llo_" on the node E above, you would remove the substring from E, split it and reattach it as two children of E. Correct?
Finally, after a certain number of such operations, the tree could, I suppose, end up with as many leaves as letters. What would be the best way to keep track of that and prune the tree (by combining substrings) as necessary?
Thanks!
For what it's worth, you can implement a Rope using Splay Trees by attaching a substring to each node of the binary search tree (not just to the leaf nodes as shown above).
The rank of each node is its size plus the size of its left subtree. But when recomputing ranks during splay operations, you need to remember to walk down the node.left.right branch, too.
If each node records a reference to the substring it represents (rather than the actual substring itself), everything runs faster. That way, when a split operation falls within an existing node, you just need to modify the node's attributes to reflect the right part of the substring you want to split, then add another node to represent the left part and merge it with the left subtree.
Done as above, each node records (in addition to its left, right and parent attributes, etc.) its rank, size (in characters) and the location of the first character it represents in the string you're trying to modify. That way, you never actually modify the initial string: you just do your operations on bits of the tree and reproduce the final string when you're ready by walking it in order.
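A rough Python sketch of that node layout (field names are mine): each node stores an offset and length into the original, never-modified string, plus cached subtree size and rank. Here the cached fields are simply recomputed from the children, bottom-up, after any rotation changes a node's children, which is one way to handle the recomputation the answer mentions.

    # Sketch of a rope node over splay trees; field names are illustrative.
    # 'rank' is the node's own length plus the total characters in its left
    # subtree, as described above.

    class RopeNode:
        def __init__(self, start, length):
            self.start = start           # offset into the original string
            self.length = length         # characters this node itself represents
            self.left = None
            self.right = None
            self.parent = None
            self.subtree_chars = length  # total characters in this subtree
            self.rank = length           # own length + size of left subtree

        def update(self):
            """Recompute cached fields from the children; call bottom-up on
            every node whose children changed (e.g. after a splay rotation)."""
            left_chars = self.left.subtree_chars if self.left else 0
            right_chars = self.right.subtree_chars if self.right else 0
            self.subtree_chars = left_chars + self.length + right_chars
            self.rank = left_chars + self.length

    def to_string(root, original):
        """Reproduce the edited string by an in-order walk, as described above."""
        if root is None:
            return ""
        return (to_string(root.left, original)
                + original[root.start:root.start + root.length]
                + to_string(root.right, original))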

Most performant way to find all the leaf nodes in a tree data structure

I have a tree data structure where each node can have any number of children, and the tree can be of any height. What is the optimal way to get all the leaf nodes in the tree? Is it possible to do better than just traversing every path in the tree until I hit the leaf nodes?
In practice the tree will usually have a max depth of 5 or so, and each node in the tree will have around 10 children.
I'm open to other types of data structures or special trees that would make getting the leaf nodes especially optimal.
I'm using JavaScript, but I'm really just looking for general recommendations in any language.
Thanks!
Memory layout is essential to optimal retrieval, so the child lists should be contiguous arrays rather than linked lists, and the nodes should be placed one after another in retrieval order.
The more static your tree is, the better the layout that can be achieved.
All in one layout
All in one array totally ordered
Pro
memory can be streamed for maximal throughput (hardware pre-fetch)
no unneeded page lookups
normal lookups can be made
no extra memory to make linked lists.
internal nodes use offsets to find their children relative to their own position
Con
inserting / deleting can be cumbersome
insert / delete O(N)
insert might lead to resize of the array leading to a costly copy
Two array layout
One array for internal nodes
One array for leaves
Internal nodes point to the leaves
Pro
leaf nodes can be streamed at maximum throughput (maybe the best layout if you're mostly interested only in the leaves).
no unneeded page lookups
indirect lookups can be made
Con
if all leaves are ordered, insert / delete can be cumbersome
if leaves are unordered, insertion is easy: just add at the end.
deleting unordered leaves is also a problem if no tombstones are allowed, as the last leaf would have to be moved into the gap and the internal nodes would need fixing up (via a further indirection this can also be fixed; see slot-map)
resizing either array might lead to a large copy, though less than in the all-in-one layout, as the two arrays can be resized independently.
Array of arrays (dynamically sized, e.g. a C++ vector of vectors)
using contiguous arrays for referencing the children of each node
Pro
running through each child list is fast
each child array may be resized independently
Con
while this removes much of the extra work of linked-list children, the individual lists are dispersed among all the other data, making lookups take extra time.
an insert might cause a resize and copy of an array.
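As an illustration, here is a minimal Python sketch of the two array layout (the representation is my own choice; Python lists stand in for the contiguous arrays you would use in a systems language). The point is that collecting every leaf is just a scan of one flat array and never touches the internal nodes.

    # Two array layout sketch: internal nodes in one flat list, leaf payloads
    # in another, with internal nodes referencing leaves by (offset, count).

    leaf_values = []     # all leaf payloads, back to back, in retrieval order
    internal_nodes = []  # each internal node: child indices plus a leaf run

    def add_leaf_run(values):
        """Append a run of sibling leaves; return (offset, count) for the parent."""
        offset = len(leaf_values)
        leaf_values.extend(values)
        return offset, len(values)

    def add_internal(child_indices, leaf_offset=None, leaf_count=0):
        """An internal node points at other internal nodes by index and/or at a
        contiguous run of leaves by (offset, count)."""
        internal_nodes.append({
            "children": child_indices,
            "leaf_offset": leaf_offset,
            "leaf_count": leaf_count,
        })
        return len(internal_nodes) - 1

    def all_leaves():
        return leaf_values   # already contiguous: no tree walk needed

    # Example: a root with two internal children, each owning a run of leaves.
    off_a, n_a = add_leaf_run(["a1", "a2", "a3"])
    off_b, n_b = add_leaf_run(["b1", "b2"])
    child_a = add_internal([], off_a, n_a)
    child_b = add_internal([], off_b, n_b)
    root = add_internal([child_a, child_b])
    print(all_leaves())   # ['a1', 'a2', 'a3', 'b1', 'b2']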
Finding the leaves of a tree is O(n), which is optimal for a tree, because you have to look at all n of the things you are retrieving, plus the branch nodes along the way; the branch nodes are only a constant-factor overhead.
If we increase the branching factor, e.g. letting each branch have 32 children instead of 2, we significantly decrease the number of overhead nodes, which might make the traversal faster.
If we skip a branch, we're not including the values in that branch, so we have to look at all branches.
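For reference, the plain O(n) traversal being described, assuming a simple {value, children} node shape (that shape is my assumption, not from the question):

    # Collect all leaves by visiting every node once.
    def collect_leaves(node, out=None):
        if out is None:
            out = []
        children = node.get("children") or []
        if not children:
            out.append(node["value"])
            return out
        for child in children:
            collect_leaves(child, out)
        return out

    tree = {"value": "root", "children": [
        {"value": "a", "children": [{"value": "a1", "children": []}]},
        {"value": "b", "children": []},
    ]}
    print(collect_leaves(tree))   # ['a1', 'b']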

Deleting a node in a BK Tree

I have seen many different implementations of BK Trees in many different languages, and literally none of them seem to include a way to remove nodes from the tree.
Even the original article in which BK Trees were first introduced does not provide meaningful insight into node deletion, as the authors merely suggest marking the node to be deleted so that it is ignored:
The deletion of a key in Structures 1 [the BK Tree] and 2 follows a process similar to that above, with special consideration for the case in which the key to be deleted is the representative x° [root key]. In this case, the key cannot simply be deleted, as it is essential for the structure information. Instead an extra bit must be used for each key which denotes whether the key actually corresponds to a record or not. The search algorithm is modified correspondingly to ignore keys which do not correspond to records. This involves testing the extra bit in the Update procedure.
While it may be theoretically possible to properly delete a node in a BK Tree, is it possible to do so in linear/sublinear time?
While it may be theoretically possible to properly delete a node in a BK Tree, is it possible to do so in linear/sublinear time?
If you want to physically remove it from a BK-Tree, then I can't think of a way to do this in linear time for all cases. Consider two scenarios in which a node is removed. Note that I do not account for the time complexity of calculating the Levenshtein distance, because that operation doesn't depend on the number of words, although it requires some processing time too.
Remove non-root node
Find a parent of the node in the tree.
Save node's child nodes.
Nullify parent's reference to the node.
Re-add each child node as if it were a new node.
Here, even if step 1 can be done in O(1), steps 2 and 4 are way more expensive. Inserting a single node is O(h), where h is the height of the tree. To make matters worse, this has to be done for each child node of the original node, and so it will be O(k*h), where k is the number of child nodes.
Remove root node
Rebuild the tree from scratch without using the previous root node.
Rebuilding a tree will be at least O(n) in the best case and O(h*n) otherwise.
Alternative solution
That's why it's better not to delete a node physically, but to keep it in the tree and just mark it as deleted. This way it will be used, as before, for inserting new nodes, but will be excluded from the suggestion results for a misspelled word. This can be done in O(1).
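A small Python sketch of that mark-as-deleted approach (the node layout and method names are mine): deleted words keep their place in the tree, so they still route insertions and searches, but they are filtered out of the results.

    def distance(a, b):
        """Levenshtein distance (simple O(len(a)*len(b)) dynamic programming)."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    class BKNode:
        def __init__(self, word):
            self.word = word
            self.deleted = False
            self.children = {}            # distance -> BKNode

        def insert(self, word):
            d = distance(word, self.word)
            if d == 0:
                self.deleted = False      # re-inserting a deleted word revives it
            elif d in self.children:
                self.children[d].insert(word)
            else:
                self.children[d] = BKNode(word)

        def delete(self, word):
            """Find the node and flip the flag, leaving the structure intact."""
            d = distance(word, self.word)
            if d == 0:
                self.deleted = True
            elif d in self.children:
                self.children[d].delete(word)

        def search(self, word, tol, out):
            d = distance(word, self.word)
            if d <= tol and not self.deleted:
                out.append(self.word)
            for k, child in self.children.items():
                if d - tol <= k <= d + tol:    # triangle-inequality pruning
                    child.search(word, tol, out)
            return out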

HRW rendezvous hashing in log time?

The Wikipedia page for Rendezvous hashing (Highest Random Weight "HRW") makes the following claim:
While it might first appear that the HRW algorithm runs in O(n) time, this is not the case. The sites can be organized hierarchically, and HRW applied at each level as one descends the hierarchy, leading to O(log n) running time, as in.[7]
I got a copy of the referenced paper, "Hash-Based Virtual Hierarchies for Scalable Location Service in Mobile Ad-hoc Networks." However the hierarchy referenced in their paper seems to be very specific to their application domain. As far as I can discern, there is no clear indication of how to generalize the method. The Wikipedia remark makes it seem like log is the general case.
I looked at a few general HRW implementations, and none of them seemed to support anything better than linear time. I gave it some thought, but I don't see any way to organize sites hierarchically without causing parent nodes to cause inefficient remapping when they drop out, significantly defeating the main advantage of HRW.
Does anybody know how to do this? Alternatively, is Wikipedia incorrect about there being a general way to implement this in log time?
Edit: Investigating mcdowella's approach:
OK, I think I see how this could work. But you need a little more than you've specified.
If you just do what you've described, you get into a situation where each leaf probably has either zero or one nodes in it, and there's significant variance in how many nodes are in the leaf-most subtrees. If you swap using HRW at each level for just making the whole thing a regular search tree, you get exactly the same effect. Essentially, you've got an implementation of consistent hashing, along with its flaw of unequal loading between buckets. Computing the combined weights, the defining operation of HRW, adds nothing; you're better off just doing a search at each level, since it saves doing the hashes, and it can be implemented without looping over each radix value.
It's fixable though: you just need to be using HRW to choose from many alternatives at the final level. That is, you need all of the leaf nodes to be in large buckets, comparable to the number of replicas you'd have in consistent hashing. These large buckets should be approximately equally-loaded compared to each other, and then you're using HRW to choose the specific site. Since the bucket sizes are fixed, this is an O(n) algorithm, and we get all of the key HRW properties.
Honestly though, I think this is pretty questionable. It isn't so much an implementation of HRW, as it is just combining HRW with consistent hashing. I guess there's nothing wrong with that, and it might even be better than the usual technique of using replicas, in some cases. But I think it's misleading to state that HRW is log(n), if this is actually what the author meant.
Additionally, the original description is also questionable. You don't need to apply HRW at each level, and you shouldn't, as there is no advantage in doing so; you should do something fast (such as indexing), and just use HRW for the final choice.
Is this really the best we can do, or is there some other way to make HRW O(log(n))?
If you give each site a sufficiently long random id expressed in radix k (perhaps by hashing a non-random id) then you can associate the sites with leaves of a tree which has at most k descendants at each node. There is no need to associate any site with an internal node of the tree.
To work out where to store an item, use HRW to work out from the root of the tree down which way to branch at tree nodes, stopping when you reach a leaf, which is associated with a site. You can do this without having to communicate with any site until you work out which site you want to store the item at - all you need to know is the hashed ids of the sites to construct a tree.
Because sites are associated only with leaves there is no way an internal node of the tree can drop out, except if all of the sites associated with leaves under it drop out, at which point it will become irrelevant.
I don't buy the updated answer. There are two nice properties of HRWs that appear to get lost when you compare the weights of branches instead of all sites.
One is that you can pick the top-n sites instead of just the primary, and these should be randomly distributed. If you're descending into a single tree, the top-n sites will be near each other in the tree. This could be fixed by descending multiple times with different salts but that seems like a lot of extra work.
Two is that it is obvious what happens when a site is added or removed, and only 1/|sites| of the data moves in the case of an add. If you modify the existing tree, it only affects the peer site. In the case of an add, the only data that moves is from the new peer of the added site. In the case of a delete, all the data that was at that site now moves to the former peer. If you instead recompute the tree, all of the data could move, depending on the way the tree is constructed.
I think you can use the same "virtual node" approach normally used for consistent hashing. Suppose you have N physical nodes with IDs:
{n1,...,nN}.
Choose V, the number of virtual nodes per physical node, and generate a new list of IDs:
{n1v1, n1v2, ..., n1vV,
 n2v1, n2v2, ..., n2vV,
 ...,
 nNv1, nNv2, ..., nNvV}.
Arrange these into the leaves of a fixed but randomized binary tree with labels on the internal nodes. These internal labels could be, for example, a concatenation of the labels of its child nodes.
To choose a physical node to store an object O at, start at the root and choose the branch with the higher hash H(label,O). Repeat the process until you reach a leaf. Store the object at the physical node corresponding to the virtual node at that leaf. This takes O(log(NV)) = O(log(N)+log(V)) = O(log(N)) steps (since V is constant).
If a physical node fails, the objects at that node are rehashed, skipping over subtrees with no active leaves.
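A hedged Python sketch of that scheme (the hash choice, the label concatenation, and the pairing used to build the balanced tree are my own choices; hashlib stands in for H(label, O)):

    import hashlib

    def H(label, obj):
        return int.from_bytes(hashlib.sha256(f"{label}|{obj}".encode()).digest(), "big")

    class Node:
        def __init__(self, label, left=None, right=None, physical=None):
            self.label = label
            self.left, self.right = left, right
            self.physical = physical          # set only on leaves
            self.alive = True                 # cleared when the physical node fails

    def build(physical_ids, V):
        # Leaves: one per virtual node n_i v_j.
        level = [Node(f"{p}v{j}", physical=p) for p in physical_ids for j in range(V)]
        while len(level) > 1:
            nxt = []
            for i in range(0, len(level) - 1, 2):
                l, r = level[i], level[i + 1]
                nxt.append(Node(l.label + r.label, l, r))
            if len(level) % 2:                # odd node out is promoted as-is
                nxt.append(level[-1])
            level = nxt
        return level[0]

    def subtree_alive(node):
        # A real implementation would cache per-subtree alive counts
        # instead of rescanning on every lookup.
        if node.physical is not None:
            return node.alive
        return subtree_alive(node.left) or subtree_alive(node.right)

    def choose(root, obj):
        """Descend, preferring the child with the higher H(label, obj) and
        skipping subtrees whose leaves have all failed (assumes at least one
        leaf is still alive)."""
        node = root
        while node.physical is None:
            candidates = [c for c in (node.left, node.right) if subtree_alive(c)]
            node = max(candidates, key=lambda c: H(c.label, obj))
        return node.physical

    root = build(["n1", "n2", "n3", "n4"], V=4)
    print(choose(root, "some-object-key"))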
One way to implement HRW rendezvous hashing in log time
One way to implement rendezvous hashing in O(log N), where N is the number of cache nodes:
Each file named F is cached in the cache node named C with the largest weight w(F,C), as is normal in rendezvous hashing.
First, we use a nonstandard hash function w() something like this:
w(F,C) = h(F) xor h(C).
where h() is some good hash function.
tree construction
Given some file named F, rather than calculate w(F,C) for every cache node -- which requires O(N) time for each file -- we pre-calculate a binary tree based only on the hashed names h(C) of the cache nodes; a tree that lets us find the cache node with the maximum w(F,C) value in O(log N) time for each file.
Each leaf of the tree contains the name C of one cache node.
The root (at depth 0) of the tree points to 2 subtrees.
All the leaves where the most significant bit of h(C) is 0 are in the root's left subtree; all the leaves where the most significant bit of h(C) is 1 are in the root's right subtree.
The two children of the root node (at depth 1) deal with the next-most-significant bit of h(C).
And so on, with the interior nodes at depth D dealing with the D'th-most-significant bit of h(C).
With a good hash function, each step down from the root approximately halves the candidate cache nodes in the chosen subtree, so we end up with a tree of depth roughly log2 N.
(If we end up with a tree that is "too unbalanced", somehow get everyone to agree on some different hash function from some universal hashing family and rebuild the tree, before we add any files to the cache, until we get a tree that is "not too unbalanced".)
Once the tree has been built, we never need to change it no matter how many file names F we later encounter.
We only change it when we add or remove cache nodes from the system.
filename lookup
For a filename F that happens to hash to h(F) = 0 (all zero bits), we find the cache node with the highest weight (for that filename) by starting at the root and always taking the right subtree when possible.
If that leads us to an interior node that doesn't have a right subtree, then we take its left subtree.
Continue until we reach a node without a left or right subtree -- i.e., a leaf node that contains the name of the selected cache node C.
When looking up some other file named F, first we hash its name to get h(F); then we start at the root and go right or left respectively (if possible), depending on whether the next bit of h(F) is 0 or 1.
Since the tree (by construction) is not "too unbalanced", traversing the whole tree from the root to the leaf that contains the name of the chosen cache node C requires O(log N) time in the worst case.
We expect that for a typical set of file names, the hash function h(F) "randomly" chooses left or right at each depth of the tree.
Since the tree (by construction) is not "too unbalanced", we expect each physical cache node to cache roughly the same number of files (within a multiple of 4 or so).
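A sketch of that construction and lookup in Python (the 32-bit id width, SHA-256, and the dict-based trie are my own simplifications of the scheme described above):

    import hashlib

    BITS = 32   # length of the hashed ids; an assumption for this sketch

    def h(name):
        return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big") % (1 << BITS)

    def build_tree(cache_nodes):
        """Binary tree keyed on the bits of h(C), most significant bit first;
        each leaf holds the name C of one cache node."""
        root = {}
        for name in cache_nodes:
            hc, node = h(name), root
            for depth in range(BITS):
                bit = (hc >> (BITS - 1 - depth)) & 1
                node = node.setdefault(bit, {})
            node["leaf"] = name
        return root

    def lookup(root, filename):
        """Walk toward the leaf maximizing w(F,C) = h(F) xor h(C): at each
        depth, prefer the child whose bit differs from the corresponding bit
        of h(F), falling back to the only existing child when that subtree is
        empty."""
        hf, node = h(filename), root
        for depth in range(BITS):
            bit = (hf >> (BITS - 1 - depth)) & 1
            preferred = 1 - bit
            node = node[preferred] if preferred in node else node[bit]
        return node["leaf"]

    tree = build_tree(["cacheA", "cacheB", "cacheC", "cacheD"])
    print(lookup(tree, "some/file.txt"))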
drop out effects
When some physical cache node fails, everyone deletes the corresponding leaf node from their copy of this tree.
(Everyone also deletes every interior node that then has no leaf descendants).
This doesn't require moving around any files cached on any other cache node -- they still map to the same cache node they always did.
(The right-most leaf node in a tree is still the right-most leaf node in that tree, no matter how many other nodes in that tree are deleted).
For example,
        ....
           \
            |
           / \
          |   |
         /   / \
        |   X   |
       / \     / \
      V   W   Y   Z
With this O(log N) algorithm, when cache node X dies, leaf X is deleted from the tree, and all its files become (hopefully relatively evenly) distributed between Y and Z -- none of the files from X end up at V or W or any other cache node.
All the files that previously went to cache nodes V, W, Y, Z continue to go to those same cache nodes.
rebalancing after dropout
Many cache nodes failing, or many new cache nodes being added, or both, may make the tree "too unbalanced".
Picking a new hash function is a big hassle after we've added a bunch of files to the cache, so rather than pick a new hash function like we did when initially constructing the tree, maybe it would be better to somehow rebalance the tree by removing a few nodes, renaming them with some new semi-random names, and then adding them back to the system.
Repeat until the system is no longer "too unbalanced".
(Start with the most unbalanced nodes -- the nodes caching the least amount of data.)
comments
p.s.:
I think this may be pretty close to what mcdowella was thinking, but with more details filled in to clarify that (a) yes, it is log(N) because it's a binary tree that is "not too unbalanced", (b) it doesn't have "replicas", and (c) when one cache node fails, it doesn't require any remapping of files that were not on that cache node.
p.p.s.:
I'm pretty sure that Wikipedia page is wrong to imply that typical implementations of rendezvous hashing occur in O(log N) time, where N is the number of cache nodes.
It seems to me (and I suspect the original designers of the hash as well) that the time it takes to (internally, without communicating) recalculate a hash against every node in the network is going to be insignificant and not worth worrying about compared to the time it takes to fetch data from some remote cache node.
My understanding is that rendezvous hashing is almost always implemented with a simple linear algorithm that uses O(N) time, where N is the number of cache nodes, every time we get a new filename F and want to choose the cache node for that file.
Such a linear algorithm has the advantage that it can use a "better" hash function than the above xor-based w(), so when some physical cache node dies, all the files that were cached on the now-dead node are expected to become evenly distributed among all the remaining nodes.
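For comparison, the simple linear algorithm referred to here fits in a few lines (the hash choice and names are mine):

    import hashlib

    def weight(filename, cache_node):
        return int.from_bytes(
            hashlib.sha256(f"{cache_node}|{filename}".encode()).digest(), "big")

    def choose_node(filename, cache_nodes):
        # O(N) in the number of cache nodes, but each step is just one cheap hash.
        return max(cache_nodes, key=lambda c: weight(filename, c))

    print(choose_node("some/file.txt", ["cacheA", "cacheB", "cacheC"]))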

Disadvantages of top-down node splitting on insertion into B+ tree

For a B+ tree insertion why would you traverse down the tree then back upwards splitting the parents?
Wikipedia suggests this method of insertion:
Perform a search to determine what bucket the new record should go into.
If the bucket is not full (at most b - 1 entries after the insertion), add the record.
Otherwise, split the bucket.
Allocate new leaf and move half the bucket's elements to the new bucket.
Insert the new leaf's smallest key and address into the parent.
If the parent is full, split it too.
Add the middle key to the parent node.
Repeat until a parent is found that need not split.
If the root splits, create a new root which has one key and two pointers.
Why would you traverse down the tree and then go back up performing the splits? Why not split the nodes as you encounter them on the way down?
To me, the proposed method performs twice the work and requires more bookkeeping as well.
Can anyone explain why this is the preferred method for insertion as opposed to splitting on the way down and what the disadvantages are for inserting during the traversal?
You have to backtrack up the tree because you don't actually know whether a split is required at the lowest level until you get there.
It's all there in the phrase "If the bucket is not full, ...".
You should also be aware that it's nowhere near twice the work. Since you're remembering all sorts of stuff on the way down (node pointers, indexes within the node, and so on), there's not as much calculation or searching on the way back up.
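Here is a compact Python sketch of that bookkeeping: the descent records (node, child index) pairs, and the split loop walks back up the recorded path, so no re-searching is needed. The node layout and ORDER are my own simplifications (keys only, no records or leaf chaining), not a full B+ tree.

    import bisect

    ORDER = 4  # max children per internal node; max keys per leaf (a sketch value)

    class Node:
        def __init__(self, leaf=True):
            self.leaf = leaf
            self.keys = []
            self.children = []   # only used by internal nodes

    def split(node):
        """Split an overfull node; return (separator_key, new_right_sibling)."""
        mid = len(node.keys) // 2
        right = Node(leaf=node.leaf)
        if node.leaf:
            # B+ tree style: the separator is copied up, the right leaf keeps it.
            right.keys = node.keys[mid:]
            node.keys = node.keys[:mid]
            return right.keys[0], right
        # Internal node: the separator moves up and is kept in neither half.
        sep = node.keys[mid]
        right.keys = node.keys[mid + 1:]
        right.children = node.children[mid + 1:]
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return sep, right

    def insert(root, key):
        # Walk down, remembering (parent, child_index) so the way back up
        # needs no re-searching -- just the bookkeeping the answer mentions.
        path, node = [], root
        while not node.leaf:
            i = bisect.bisect_right(node.keys, key)
            path.append((node, i))
            node = node.children[i]
        bisect.insort(node.keys, key)

        # Walk back up, splitting only while a node is overfull.
        while len(node.keys) > ORDER - 1:
            sep, right = split(node)
            if path:
                parent, i = path.pop()
                parent.keys.insert(i, sep)
                parent.children.insert(i + 1, right)
                node = parent
            else:
                # The root split: a new root with one key and two pointers.
                new_root = Node(leaf=False)
                new_root.keys = [sep]
                new_root.children = [node, right]
                return new_root
        return root

    root = Node(leaf=True)
    for k in range(20):
        root = insert(root, k)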

Resources