kd-tree: duplicate key and deletion

These slides (13) describe how to delete a point from a kd-tree. They state that if the node to be deleted has only a left subtree, that subtree can be moved to become the right subtree. The minimum can then be found there and recursively deleted, just as with an ordinary right subtree.
The reason given is that points whose key equals the current node's key in the current dimension should be in the right subtree.
My question: why does a point with an equal key have to be the right child of its parent? Also, what happens if my kd-tree algorithm returns a tree with an equal-key point on the left?
For example:
Assume the dataset (7,2), (7,4), (9,6)
The resulting kd-tree would be (sorted with respect to one axis):
      (7,2)
      /   \
  (7,4)   (9,6)
Another source that states the same theory is this one (paragraph above Example 15.4.3):
Note that we can replace the node to be deleted with the least-valued node from the right subtree only if the right subtree exists. If it does not, then a suitable replacement must be found in the left subtree. Unfortunately, it is not satisfactory to replace N's record with the record having the greatest value for the discriminator in the left subtree, because this new value might be duplicated. If so, then we would have equal values for the discriminator in N's left subtree, which violates the ordering rules for the kd tree. Fortunately, there is a simple solution to the problem. We first move the left subtree of node N to become the right subtree (i.e., we simply swap the values of N's left and right child pointers). At this point, we proceed with the normal deletion process, replacing the record of N to be deleted with the record containing the least value of the discriminator from what is now N's right subtree.
Both refer to nodes that have only a left subtree, but why would this case be any different?
Thanks!

There is no hard and fast rule that equal keys must go on the right. You can put them on the left instead.
But if you do, you also need to update your search and delete algorithms to match.
Have a look at these links:
https://www.geeksforgeeks.org/k-dimensional-tree/
https://www.geeksforgeeks.org/k-dimensional-tree-set-3-delete/
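The deletion procedure quoted in the question can be sketched as follows. This is only an illustration under the assumption that strictly smaller keys go left and equal or larger keys go right; the names `Node`, `insert`, `find_min`, `delete`, `points` and `is_valid` are hypothetical helpers, not taken from either source.

```python
class Node:
    def __init__(self, point, left=None, right=None):
        self.point = point
        self.left = left
        self.right = right


def insert(node, point, depth=0):
    # Assumed convention: strictly smaller goes left, equal or larger goes right.
    if node is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < node.point[axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node


def find_min(node, dim, depth):
    # Point with the smallest coordinate in dimension `dim` within this subtree.
    if node is None:
        return None
    axis = depth % len(node.point)
    if axis == dim:
        # Strictly smaller values can only be in the left subtree.
        if node.left is None:
            return node.point
        return find_min(node.left, dim, depth + 1)
    # Other axes say nothing about `dim`, so both children must be checked.
    candidates = [node.point]
    for child in (node.left, node.right):
        best = find_min(child, dim, depth + 1)
        if best is not None:
            candidates.append(best)
    return min(candidates, key=lambda p: p[dim])


def delete(node, point, depth=0):
    if node is None:
        return None
    axis = depth % len(point)
    if node.point == point:
        if node.right is None and node.left is not None:
            # Only a left subtree: move it to the right, then delete as usual.
            node.right, node.left = node.left, None
        if node.right is None:
            return None  # leaf node, simply remove it
        # Replace with the minimum of the right subtree for this node's axis.
        node.point = find_min(node.right, axis, depth + 1)
        node.right = delete(node.right, node.point, depth + 1)
        return node
    if point[axis] < node.point[axis]:
        node.left = delete(node.left, point, depth + 1)
    else:
        node.right = delete(node.right, point, depth + 1)
    return node


def points(node):
    # All points stored in the subtree.
    if node is None:
        return []
    return points(node.left) + [node.point] + points(node.right)


def is_valid(node, depth=0):
    # Checks "left strictly smaller, right equal or larger" at every node.
    if node is None:
        return True
    axis = depth % len(node.point)
    split = node.point[axis]
    return (all(p[axis] < split for p in points(node.left))
            and all(p[axis] >= split for p in points(node.right))
            and is_valid(node.left, depth + 1)
            and is_valid(node.right, depth + 1))


root = None
for p in [(7, 2), (5, 4), (5, 1), (3, 3)]:
    root = insert(root, p)  # the root ends up with only a left subtree
root = delete(root, (7, 2))
print(root.point, is_valid(root))
```

Taking the minimum of the moved subtree means every remaining point in it is equal or larger than the replacement, which is what the "equal keys on the right" convention requires. If equal keys were stored on the left instead, the mirrored version (taking the maximum of the left subtree) would be the natural choice.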

Related

Binary Search Tree Node Removal: How to decide which subtree to traverse for removal? (If node has two children)

I am seeking help in understanding Binary search tree removals when a node has two children.
What I know is that when a BST node to be removed has two children, one can find either the smallest value starting from the right subtree or the largest value from the left subtree.
Which subtree should I traverse by default: the right or the left? Under what conditions should I pick the left or the right subtree? How much does this choice matter?
Please bear with me, as I'm a newbie to DS and algos.

If the tree is more or less balanced, it won’t matter which subtree you traverse to find the replacement.
Otherwise, if the tree is left-heavy, going to the right subtree and picking the smallest value could be quicker, and vice versa if the tree is right-heavy.
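Also, if the BST has only unique values, picking the replacement from the left subtree or from the right subtree will not affect the tree invariant. If the tree allows duplicates, the choice has to respect the side on which equal keys are stored: for example, when equal keys always go right, the largest value in the left subtree may itself be duplicated there, and the duplicate would end up on the wrong side of the new root.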

Why is the successor of a BST node defined as the one larger than the deleted one?

In the following image, if I add 14 to the right side of 12, then 14 can replace the 15 without influencing other nodes, just like the correct answer 16. Why is the successor defined as the number that is a bit larger than the deleted one, and not the one that is a bit smaller?

Well, in terms of language, successor is the one that comes right after, implying that it must be bigger.
In terms of the deletion algorithm, you can use either the successor or the predecessor to replace the deleted node.
Successor: the smallest node in the right subtree of the deleted node, i.e. the smallest node that is bigger than the deleted node. If you replace the deleted node with it, it is guaranteed to still be smaller than every other node in the right subtree, so it won't break any property.
Predecessor: the biggest node in the left subtree of the deleted node, i.e. the biggest node that is smaller than the deleted node. If you replace the deleted node with it, it is guaranteed to still be bigger than every other node in the left subtree, so it won't break any property.
In a nutshell, you can use the successor or the predecessor without any problems. It is not a question of definition, only a question of choice.
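To make the two options concrete, here is a minimal sketch of BST deletion that can use either replacement. It assumes unique keys, and the `Node` class, the helper names and the `use_predecessor` flag are made up for this illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def insert(node, key):
    # Assumes unique keys: an existing key is simply ignored.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node


def delete(node, key, use_predecessor=False):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key, use_predecessor)
    elif key > node.key:
        node.right = delete(node.right, key, use_predecessor)
    else:
        # Zero or one child: splice the node out.
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        if use_predecessor:
            # Predecessor: rightmost node of the left subtree.
            repl = node.left
            while repl.right is not None:
                repl = repl.right
            node.key = repl.key
            node.left = delete(node.left, repl.key, use_predecessor)
        else:
            # Successor: leftmost node of the right subtree.
            repl = node.right
            while repl.left is not None:
                repl = repl.left
            node.key = repl.key
            node.right = delete(node.right, repl.key, use_predecessor)
    return node


def inorder(node):
    # Sorted list of keys; a valid BST always yields ascending order.
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


keys = [6, 3, 9, 1, 4, 8, 10]
a = delete(build(keys), 6)                        # successor 8 becomes the root
b = delete(build(keys), 6, use_predecessor=True)  # predecessor 4 becomes the root
print(a.key, b.key, inorder(a) == inorder(b))
```

Both variants leave the same sorted key sequence; only the shape of the tree differs.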

Binary Search Trees-Deletion

In the case of binary search trees, why can't we simply use the predecessor instead of the successor when deleting a node that has two children?

We'd like to delete such a node with a minimum amount of work and disruption to the structure of the tree.
Suppose we want to delete the node containing 6 from the following tree:
The standard solution is based on this idea: we leave the node containing 6 exactly where it is, but we get rid of the value 6 and find another value to store in that node. This value is taken from a node below the 6 node, and it is that node that is actually removed from the tree.
Now, what value can we move into the vacated node and have a binary search tree? Well, here's how to figure it out. If we choose value X, then:
(1) everything in the left subtree must be smaller than X.
(2) everything in the right subtree must be bigger than X.
Let's suppose we're going to get X from the left subtree. (2) is guaranteed because everything in the left subtree is smaller than everything in the right subtree. What about (1)? If X is coming from the left subtree, (1) says that there is a unique choice for X - we must choose X to be the largest value in the left subtree. In our example, 3 is the largest value in the left subtree. So if we put 3 in the vacated node and delete it from its current position we will have a BST with 6 deleted.
The result is:

why can't we simply use the predecessor instead of the successor when deleting a node that has two children?
We can use either one; it is not necessary to replace the deleted node with the inorder successor. This is because in either case the general contract of a BST is maintained.
Case 1. Replace the deleted node with the inorder successor.
This is done by finding the leftmost node in the deleted node's right subtree.
Case 2. Replace the deleted node with the inorder predecessor.
This is done by finding the rightmost node in the deleted node's left subtree.
Note that both these cases will keep all the elements in the left subtree smaller and all elements in right subtree greater than the element that we have brought into the position of the deleted node.

How to find all elements in an ordered dictionary implemented as a binary tree that have the key k

Given a binary tree I need to implement a method findAllElements(k) to find all the elements in the tree with a key equal to k.
The idea I had is that, the first time you come across an element with key k, all the other elements with the same key should be either in the left child's right subtree or in the right child's left subtree. But I was told this may not be the case?
I just need to find a way to implement an algorithm, so pseudocode is what I need.
Sorry, I probably should have added this: in this implementation the left subtree contains keys less than or equal to the key at the root, and the right subtree contains keys greater than or equal to the key at the root.

It depends on your tree implementation. By binary tree I assume you mean binary search tree, and that you use operator< to compare keys. That is, the left subtree of a node contains only nodes with keys less than (<) the node's key, and the right subtree of a node contains only nodes with keys not less than (!<) the node's key.
e.g.
7
/ \
4 7
/ \
6 8
If there are multiple equal keys in the tree, do this:
k < current_node_key: search the left subtree
k > current_node_key: search the right subtree
k == current_node_key: record the current node, then search the right subtree

Look at the current node. If its key is higher than k, search the left subtree. If it is lower, search the right subtree. If it is equal, search both left and right subtrees (and also include the current node in the results).
Do that recursively starting from the root node.
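As a rough sketch of that recursion, assuming (as the question states) that equal keys may sit in either subtree; the `Node` class with a `value` field and the function name `find_all` are invented for the example:

```python
class Node:
    def __init__(self, key, value, left=None, right=None):
        self.key = key
        self.value = value
        self.left = left
        self.right = right


def find_all(node, k, out=None):
    # Collects the values of all nodes whose key equals k, in in-order sequence.
    if out is None:
        out = []
    if node is None:
        return out
    if k < node.key:
        find_all(node.left, k, out)
    elif k > node.key:
        find_all(node.right, k, out)
    else:
        # Equal keys may be on both sides, so search both subtrees.
        find_all(node.left, k, out)
        out.append(node.value)
        find_all(node.right, k, out)
    return out


# Hand-built tree where key 7 appears in the left and the right subtree.
root = Node(7, "a",
            left=Node(4, "b", right=Node(7, "c")),
            right=Node(7, "d", right=Node(8, "e")))
print(find_all(root, 7))
```

Subtrees that cannot contain k are still pruned, so only the paths leading to matching nodes are visited.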

Thought I'd come back and explain what the result should have been after talking with my teacher. If you perform a method findElement(k) that finds an element with key equal to k, the element it finds should be the element highest in the tree with key k (let's denote this element V).
Then, from this element V, the other elements that contain key=k will be either in the left child's subtree (particularly all the way to the right) or in the right child's subtree (particularly all the way to the left). So for the left child, keep going to the next node's right child until an element with key=k is found. Now every element in the subtree with this node as its root must have key=k (this is the part I didn't recognize at first), so ANY kind of traversal of this full subtree can be used to find and store all of its elements (visiting every node in it). The same must be repeated for the right child, but following left children until an element with key=k is found. The subtree with this element as its root then holds all the other elements with key=k, and they can be found by once again fully traversing this subtree.
That is just a word description of it, obviously. Sorry for the length and any confusion. Hopefully this will help anyone else trying to solve a similar problem.

A tree data structure where every left node is greater than every right node in a max situation?

Does such a tree exist and have a name, or is it just a figment of my imagination? I used to think heaps have this property but it just seems that the only requirement is for the children to be less than the parent.

It's exactly the opposite, but you may be thinking of a binary search tree, which has the following properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
Both the left and right subtrees must also be binary search trees.
There must be no duplicate nodes.
So every left node is guaranteed to be less than every right node. You can find the max by going right from the root node until you can't go right any more.
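The "keep going right" step for finding the maximum can be written in a few lines. This is a sketch only, with a hypothetical `Node` class and `find_max` function, assuming the tree satisfies the properties listed above.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find_max(root):
    # Follow right children until there is none left.
    if root is None:
        raise ValueError("empty tree has no maximum")
    node = root
    while node.right is not None:
        node = node.right
    return node.key


root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14, Node(13))))
print(find_max(root))
```

The loop only touches one node per level, so its cost is proportional to the height of the tree rather than to the number of nodes.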
