Time complexity of finding k successors in BST - algorithm

Given a binary search tree (BST) of height h, it would take O(k+h) time to apply the BST InOrder Successor algorithm k times in succession, starting in any node, applying each next call on the node that was returned by the previous call.
Pseudo code:
get_kth_successor(node):
for times = 1 to k:
node = successor(node)
return node
How can I prove this time complexity?
In particular, I am trying to establish a relation between k and the number of nodes visited, but can't find any pattern here.

Take the following truths concerning a successor(s) traversal:
You can traverse a branch not more than two times: downward and upward.
Every double visit of a branch corresponds to finding at least one more successor: when you backtrack via a branch upwards, you will have visited at least one successor more than at the time you passed that same branch the first time, in the downward direction.
The number of branches you will traverse only once cannot be more than 2h. This worst case happens when you start at a leaf in the bottom-left side of the tree and must go all the way up to the root (a successor) and then down again to a bottom-leaf to find the successor of the root. But if you need more successors after that, you will have to visit some of these branches again (in backtracking) before you can traverse other branches for the first time. So the total number of branches you traverse only once cannot increase above 2h.
So, to find k successors you will at the most traverse k branches twice (downward and upward, cf. point 2) and 2h branches once (point 3), which comes down to a worst case branch-traversal-count of 2k+2h which is O(h+k).

I am going to write the full implementation for the problem in order to make it easy to prove my arguments about the time being taken.
.
Assuming that each node of BST has the following structure:
typedef struct node{
int vale;
struct node* left;
struct node* right;
}node;
The algorithm will have 2 steps:
a. Start from the root node of the Tree and find the starting node and all the ancestor of that node. Store all this in the stack passed:
//root -> root node of the tree.
//val -> value of the node to find.
// s -> stack to store all ancestor.
node* find_node(node* root, int val,std::stack<node*> &s)
{
if(root != NULL) s.push(root);
if(root == NULL || root->value == val) return root;
if(root->value > val) return find_node(root->left);
else return find_node(root->right);
}
and the call to this method would look like:
//Assuming that the root is the root node of the tree.
std::stack<node*> s;
node* ptr = find_node(root,s); // we have all ancestors of ptr along with ptr in stack s now.
b. Now we have to print next k immediate bigger (than ptr) elements of the tree. We will start from the node (which is ptr) :
// s -> stack of all ancestor of the node.
// k -> k-successor to find.
void print_k_bigger(stack<node*> s, int k)
{
//since the top element of stack is the starting node. So we won't print it.
// We will just pop the first node and perform inorder traversal on its right child.
prev = s.top();
s.pop();
inorder(prev->right,k);
// Now all the nodes present in the stack are ancestor of the node.
while(!s.empty() && k>0)
{
//pop the node at the top of stack.
ptr = s.top();
s.pop();
//if the node popped previously (prev) was the right child of the current
//node, i.e. prev was bigger than the current node. So we will have to go
//one more level up to search for successors.
if(prev == ptr->right) continue;
//else the current node is immidiate bigger than prev node. Print it.
printf("%d",ptr->value);
//reduce the count.
k--;
//inorder the right subtree of the current node.
inorder(ptr->right);
//Note this.
prev = ptr;
}
}
Here is how our inorder will look like:
void inorder(node* ptr, int& k)
{
if(ptr != NULL)
{
inorder(ptr->left,k);
printf("%d",ptr->value);
k--;
inorder(ptr->right,k);
}
}
Time Analysis:
The method find_node is O(h) as it can go upto the length max root to leaf path.
The method print_k_bigger is O(h+k) as in every iteration of the loop, either the size of stack is reducing or the value of k is reducing. Note that when inorder() is called from inside the while loop, it won't raise additional overhead as all the calls to inorder() together will take at max O(k).

Here is a very simple algorithm to do this.
Create an empty stack S and a variable curr = NULL.
Find the starting node in the tree. Also, push all of it's ancestors (and the node itself) into stack S
Now pop a node top from the stack S and check if curr is it's right child. If it is not do an inorder traversal of it's right subtree
If we find k or more nodes while traversing, we are done
If we haven't discovered k successors yet, set curr = top, and repeat 2 till we find k nodes
Overall time compleity is O(h+k). Step 1 takes O(h) time. Step 2 takes O(h+k) time (all iterations of step 2 combined take O(h+k) time.)!

Related

Computing the nodes of a left skewed tree faster

I have to traverse a tree, in which I need to count the total number of nodes. So, in an easier way, I can traverse the tree and do some count++ to count the nodes of the tree. But this is very time-consuming for a tree having millions of nodes. It will take O(N) time where N is the number of nodes. I want to reduce the time complexity of this approach. How can I do that? For your reference, I am sharing the pseudo-code of my idea.
struct Node{
Node* left;
Node* right;
}
int traverse(Node* node){
if (node == null) return 0; //base case
count++;
count += traverse (node->left) //recursive call
count += traverse (node->right) //recursive call
return count;
}
Also please let me know if the above approach is going to be work or not? If not then why? And please suggest any faster approach.
There is no other way then to count the nodes, i.e. you have to visit them. But you have an undeclared variable count in your recursive function. Instead just start from scratch in each call, as you will only return the count of nodes below that node (and 1 to account for the node itself), irrespective of the other surroundings of that node:
int traverse(Node* node){
if (node == null) return 0; //base case
return 1 + traverse(node->left) + traverse(node->right);
}
But if you are going to count nodes repeatedly, at different states of the tree, then you may benefit from storing the count of nodes in each subtree, and keep that information updated with every mutation of the tree. So you would store in each node instance, the number of nodes in the subtree of which it is the root (including itself).
struct Node{
Node* left;
Node* right;
int count;
}
With every insert operation, you increment the count member of all the ancestors of that node, and give the node itself a count of 1. This represents O(logn) time complexity which does not increase the time complexity you probably already have for the insert operation.
With every delete operation, you decrement the count member of all the ancestors of that deleted node. This represents O(logn) time complexity which does not increase the time complexity you probably already have for the delete operation.
When you need to get the count of the whole tree, just read out the root node's count value. This is then just a O(1) operation.

Inorder successor of BST without parent info

I was trying to find inorder successor for a node in a BST. And below is my sample code.
public TreeNode getInorderSuccesor(TreeNode t)
{
if(t == null)
{
return null;
}
if(t.getRight()!=null)
{
t = t.getRight();
while(t.getLeft()!=null)
{
t = t.getLeft();
}
return t;
}
else
{
TreeNode parent = t.getParent();
while(parent!=null && parent.getLeft() != t)
{
t = parent;
parent = t.getParent();
}
return parent;
}
}
Please let me know if any condition it will fail. And one more thing if some can share algo/ code/ sudocode for find the inorder successor without parent node. thanks!!!
Here is the basic logic for the successor function; the predecessor function is similar, but reversed. A recursive function descends the tree searching for the requested key, keeping track of the current tree as the possible successor every time if follows the left sub-tree. If it finds the key, its successor is the minimum element of the right sub-tree, found by chasing left sub-trees until reaching a null node. If the search reaches a null node and hence fails to find the key, the next largest key in the tree is the successor, found in the possible successor that was saved at each recursive call. I recently implemented this algorithm at my blog.
If the nodes in the tree have unique values, then it is quite simple. You just need to find:
The node with largest number that is smaller than or equal to the value of the node whose successor needs to be determined
The node with the smallest number that is strictly larger than the value of the node whose successor needs to be determined.
The latter node will be the answer. The first node can be used to check whether the input node is a valid node in the tree.
Pseudo-code:
function inOrderSuccessor(val, tree):
(left, right) = findRange(val, tree.rootnode, null, null)
if (left.val != val):
throw Error("No node with specified value")
return right
function findRange(val, currnode, left, right):
if (currnode == null):
return (left, right)
if (currnode.val <= val):
return findRange(val, currNode.right, currnode, right)
else:
return findRange(val, currNode.right, left, currnode)
If the nodes in the tree may have duplicate value, the in-order successor is not very well-defined. It is necessary to know when you are given a node, how do you differentiate between different nodes with the same value. Are you given a real reference to the node in the tree?
In the tricky case where you are simply given a reference to a node, and there are other nodes with the same value in the tree, then there are 2 cases:
Case 1: The node has right subtree, then you only need to find the smallest element in the right subtree.
Case 2: The node doesn't have right subtree. You need to check reference equality with all the nodes which has the same value. When you have found the node in the tree, traverse upward from the node back to root (you have this in the stack), and find the first node that does a left-turn.

Deleting all nodes in a binary tree using O(1) auxiliary storage space?

The standard algorithm for deleting all nodes in a binary tree uses a postorder traversal over the nodes along these lines:
if (root is not null) {
recursively delete left subtree
recursively delete right subtree
delete root
}
This algorithm uses O(h) auxiliary storage space, where h is the height of the tree, because of the space required to store the stack frames during the recursive calls. However, it runs in time O(n), because every node is visited exactly once.
Is there an algorithm to delete all the nodes in a binary tree using only O(1) auxiliary storage space without sacrificing runtime?
It is indeed possible to delete all the nodes in a binary tree using O(n) and O(1) auxiliary storage space by using an algorithm based on tree rotations.
Given a binary tree with the following shape:
u
/ \
v C
/ \
A B
A right rotation of this tree pulls the node v above the node u and results in the following tree:
v
/ \
A u
/ \
B C
Note that a tree rotation can be done in O(1) time and space by simply changing the root of the tree to be v, setting u's left child to be v's former right child, then setting v's right child to be u.
Tree rotations are useful in this context because a right rotation will always decrease the height of the left subtree of the tree by one. This is useful because of a clever observation: it is extremely easy to delete the root of the tree if it has no left subchild. In particular, if the tree is shaped like this:
v
\
A
Then we can delete all the nodes in the tree by deleting the node v, then deleting all the nodes in its subtree A. This leads to a very simple algorithm for deleting all the nodes in the tree:
while (root is not null) {
if (root has a left child) {
perform a right rotation
} else {
delete the root, and make the root's right child the new root.
}
}
This algorithm clearly uses only O(1) storage space, because it needs at most a constant number of pointers to do a rotation or to change the root and the space for these pointers can be reused across all iterations of the loop.
Moreover, it can be shown that this algorithm also runs in O(n) time. Intuitively, it's possible to see this by looking at how many times a given edge can be rotated. First, notice that whenever a right rotation is performed, an edge that goes from a node to its left child is converted into a right edge that runs from the former child back to its parent. Next, notice that once we perform a rotation that moves node u to be the right child of node v, we will never touch node u again until we have deleted node v and all of v's left subtree. As a result, we can bound the number of total rotations that will ever be done by noting that every edge in the tree will be rotated with its parent at most once. Consequently, there are at most O(n) rotations done, each of which takes O(1) time, and exactly n deletions done. This means that the algorithm runs in time O(n) and uses only O(1) space.
In case it helps, I have a C++ implementation of this algorithm, along with a much more in-depth analysis of the algorithm's behavior. It also includes formal proofs of correctness for all of the steps of the algorithm.
Hope this helps!
Let me start with a serious joke: If you set the root of a BST to null, you effectively delete all the nodes in the tree (the garbage collector will make the space available). While the wording is Java specific, the idea holds for other programming languages. I mention this just in case you were at a job interview or taking an exam.
Otherwise, all you have to do is use a modified version of the DSW algorithm. Basically turn the tree into a backbone and then delete as you would a linked list. Space O(1) and time O(n). You should find talks of DSW in any textbook or online.
Basically DSW is used to balance a BST. But for your case, once you get the backbone, instead of balancing, you delete like you would a linked list.
Algorithm 1, O(n) time and O(1) space:
Delete node immediately unless it has both children. Otherwise get to the leftmost node reversing 'left' links to ensure all nodes are reachable - the leftmost node becomes new root:
void delete_tree(Node *node) {
Node *p, *left, *right;
for (p = node; p; ) {
left = p->left;
right = p->right;
if (left && right) {
Node *prev_p = nullptr;
do {
p->left = prev_p;
prev_p = p;
p = left;
} while ((left = p->left) != nullptr);
p->left = p->right;
p->right = prev_p; //need it on the right to avoid loop
} else {
delete p;
p = (left) ? left : right;
}
}
}
Algorithm 2, O(n) time and O(1) space: Traverse nodes depth-first, replacing child links with links to parent. Each node is deleted on the way up:
void delete_tree(Node *node) {
Node *p, *left, *right;
Node *upper = nullptr;
for (p = node; p; ) {
left = p->left;
right = p->right;
if (left && left != upper) {
p->left = upper;
upper = p;
p = left;
} else if (right && right != upper) {
p->right = upper;
upper = p;
p = right;
} else {
delete p;
p = upper;
if (p)
upper = (p->left) ? p->left : p->right;
}
}
}
I'm surprised by all the answers above that require complicated operations.
Removing nodes from a BST with O(1) additional storage is possible by simply replacing all recursive calls with a loop that searches for the node and also keeps track the current node's parent. Using recursion is only simpler because the recursive calls automatically store all ancestors of the searched node in a stack. However, it's not necessary to store all ancestors. It's only necessary to store the searched node and its parent, so the searched node can be unlinked. Storing all ancestors is simply a waste of space.
Solution in Python 3 is below. Don't be thrown off by the seemingly recursive call to delete --- the maximum recursion depth here is 2 since the second call to delete is guaranteed to result in the delete base case (root node containing the searched value).
class Tree(object):
def __init__(self, x):
self.value = x
self.left = None
self.right = None
def remove_rightmost(parent, parent_left, child):
while child.right is not None:
parent = child
parent_left = False
child = child.right
if parent_left:
parent.left = child.left
else:
parent.right = child.left
return child.value
def delete(t, q):
if t is None:
return None
if t.value == q:
if t.left is None:
return t.right
else:
rightmost_value = remove_rightmost(t, True, t.left)
t.value = rightmost_value
return t
rv = t
while t is not None and t.value != q:
parent = t
if q < t.value:
t = t.left
parent_left = True
else:
t = t.right
parent_left = False
if t is None:
return rv
if parent_left:
parent.left = delete(t, q)
else:
parent.right = delete(t, q)
return rv
def deleteFromBST(t, queries):
for q in queries:
t = delete(t, q)
return t

In a BST two nodes are randomly swapped we need to find those two nodes and swap them back

Given a binary search tree, in which any two nodes are swapped. So we need to find those two nodes and swap them back(we need to swap the nodes, not the data)
I tried to do this by making an additional array which has the inorder traversal of the tree and also saves the pointer to each node.
Then we just traverse the array and find the two nodes that are not in the sorted order, now these two nodes are searched in the tree and then swapped
So my question is how to solve this problem in O(1) space ?
In order traversal on a BST gives you a traversal on the elements, ordered.
You can use an in-order traversal, find the two out of place elements (store them in two pointers) and swap them back.
This method is actually creating the array you described on the fly, without actually creating it.
Note that however, space consumption is not O(1), it is O(h) where h is the height of the tree, due to the stack trace. If the tree is balanced, it will be O(logn)
Depending on the BST this can be done in O(1).
For instance, if the nodes of the tree have a pointer back to their parents you can use the implementation described in the nonRecrusiveInOrderTraversal section of the Wikipedia page for Tree_traversal.
As another possibility, if the BST is stored as a 1-dimensional array, instead of with pointers between nodes, you can simply walk over the array (and do the math to determine each node's parent and children).
While doing the traversal/walk, check to see if the current element is in the correct place.
If it isn't, then you can just see where in the tree it should be, and swap with the element in that location. (take some care for if the root is in the wrong place).
We can modified the isBST() method as below to swap those 2 randomly swapped nodes and correct it.
/* Returns true if the given tree is a binary search tree
(efficient version). */
int isBST(struct node* node)
{
struct node *x = NULL;
return(isBSTUtil(node, INT_MIN, INT_MAX,&x));
}
/* Returns true if the given tree is a BST and its
values are >= min and <= max. */
int isBSTUtil(struct node* node, int min, int max, struct node **x)
{
/* an empty tree is BST */
if (node==NULL)
return 1;
/* false if this node violates the min/max constraint */
if (node->data < min || node->data > max) {
if (*x == NULL) {
*x = node;//same the first incident where BST property is not followed.
}
else {
//here we second node, so swap it with *x.
int tmp = node->data;
node->data = (*x)->data;
(*x)->data = tmp;
}
//return 0;
return 1;//have to return 1; as we are not checking here if tree is BST, as we know it is not BST, and we are correcting it.
}
/* otherwise check the subtrees recursively,
tightening the min or max constraint */
return
isBSTUtil(node->left, min, node->data-1, x) && // Allow only distinct values
isBSTUtil(node->right, node->data+1, max, x); // Allow only distinct values
}

balanced binary tree visit with a twist

Looking for an in-place O(N) algorithm which prints the node pair(while traversing the tree) like below for a given balanced binary tree:
a
b c
d e f g
Output: bc, de, ef, fg
Where 'a' is the root node, 'b' the left child, 'c' the right, etc.
Please note the pair 'ef' in the output.
Additional info based on comments below:
'node pair' is each adjacent pair at each level
each node can have 'parent' pointer/reference in addition to 'left' and 'right'
If we were to relax O(N) and in-place, are there more simple solutions (both recursive and iterative)?
If this tree is stored in an array, it can be rearranged to be "level continuous" (the first element is the root, the next two its children, the next four their children, etc), and the problem is trivial.
If it is stored in another way, it becomes problematic. You could try a breadth-first traversal, but that can consume O(n) memory.
Well I guess you can create an O(n log n) time algorithm by storing the current level and the path to the current element (represented as a binary number), and only storing the last visited element to be able to create pairs. Only O(1 + log n) memory. -> This might actually be an O(n) algorithm with backtracking (see below).
I know there is an easy algorithm that visits all nodes in order in O(n), so there might be a trick to visit "sister" nodes in order in O(n) time. O(n log n) time is guaranteed tho, you could just stop at a given level.
There is a trivial O(n log n) algorithm as well, you just have to filter the nodes for a given level, increasing the level for the next loop.
Edit:
Okay, I created a solution that runs in O(n) with O(1) memory (two machine word sized variables to keep track of the current and maximal level /which is technically O(log log n) memory, but let's gloss over that/ and a few Nodes), but it requires that all Nodes contain a pointer to their parent. With this special structure it is possible to do an inorder traversal without an O(n log n) stack, only using two Nodes for stepping either left, up or right. You stop at a particular maximum level and never go below it. You keep track of the actual and the maximum level, and the last Node you encountered on the maximum level. Obviously you can print out such pairs if you encounter the next at the max level. You increase the maximum level each time you realize there is no more on that level.
Starting from the root Node in an n-1 node binary tree, this amounts to 1 + 3 + 7 + 15 + ... + n - 1 operations. This is 2 + 4 + 8 + 16 + ... + n - log2n = 2n - log2n = O(n) operations.
Without the special Node* parent members, this algorithm is only possible with O(log n) memory due to the stack needed for the in-order traversal.
Assuming you have the following structure as your tree:
struct Node
{
Node *left;
Node *right;
int value;
};
You can print out all pairs in three passes modifying the tree in place. The idea is to link nodes at the same depth together with their right pointer. You traverse down by following left pointers. We also maintain a count of expected nodes for each depth since we don't null terminate the list for each depth. Then, we unzip to restore the tree to its original configuration.
The beauty of this is the zip_down function; it "zips" together two subtrees such that the right branch of the left subtree points to the left branch of the right subtree. If you do this for every subtree, you can iterate over each depth by following the right pointer.
struct Node
{
Node *left;
Node *right;
int value;
};
void zip_down(Node *left, Node *right)
{
if (left && right)
{
zip_down(left->right, right->left);
left->right= right;
}
}
void zip(Node *left, Node *right)
{
if (left && right)
{
zip(left->left, left->right);
zip_down(left, right);
zip(right->left, right->right);
}
}
void print_pairs(const Node * const node, int depth)
{
int count= 1 << depth;
for (const Node *node_iter= node; count > 1; node_iter= node_iter->right, --count)
{
printf("(%c, %c) ", node_iter->value, node_iter->right->value);
}
if (node->left)
{
print_pairs(node->left, depth + 1);
}
}
void generate_tree(int depth, Node *node, char *next)
{
if (depth>0)
{
(node->left= new Node)->value= (*next)++;
(node->right= new Node)->value= (*next)++;
generate_tree(depth - 1, node->left, next);
generate_tree(depth - 1, node->right, next);
}
else
{
node->left= NULL;
node->right= NULL;
}
}
void print_tree(const Node * const node)
{
if (node)
{
printf("%c", node->value);
print_tree(node->left);
print_tree(node->right);
}
}
void unzip(Node * const node)
{
if (node->left && node->right)
{
node->right= node->left->right;
assert(node->right->value - node->left->value == 1);
unzip(node->left);
unzip(node->right);
}
else
{
assert(node->left==NULL);
node->right= NULL;
}
}
int _tmain(int argc, _TCHAR* argv[])
{
char value_generator= 'a';
Node root;
root.value= value_generator++;
generate_tree(2, &root, &value_generator);
print_tree(&root);
printf("\n");
zip(root.left, root.right);
print_pairs(&root, 0);
printf("\n");
unzip(&root);
print_tree(&root);
printf("\n");
return 0;
}
EDIT4: in-place, O(n) time, O(log n) stack space.

Resources