BST, finding next highest node - data-structures

In BST, according to Programming Interviews Exposed
"Given a node, you can even find the next highest node in O(log(n)) time" Pg 65
A node in BST has right child as the next highest node, then why O(log(n))? Please correct
First answer the question, then negate it

With regard to your comment "A node in BST has right child as the next highest node" (assuming here "next highest" means the next sequential value) - no, it doesn't.
That can be the case if the right child has no left sub-tree, but it's not always so.
The next sequential value comes from one of two places. (I'm using that term rather than "highest", which could be confused with tree height, or "largest", which implies a specific low-to-high order rather than any order.)
First, if the current node has a right child, move to that right child then, as long as you can see a left child, move to it.
In other words, with S and D as the source (current) and destination (next largest):
    S
   / \
  x   x   <- This is the node your explanation chose,
     / \     but it's the wrong one in this case.
    x   x
   /
  D       <----- This is the actual node you want.
   \
    x
Otherwise (i.e., if the current node has no right child), you need to move up to the parent continuously (so nodes need a right, left and parent pointer) until the node you moved from was a left child. If you get to the root and you still haven't moved up from a left child, your original node was already the highest in the tree.
Graphically that entire process is illustrated with:
x
 \
  D   <- Walking up the tree until you came up
 / \     from a left node.
x   x
     \
      x
     / \
    x   S
       /
      x
The pseudo-code for such a function (that covers both those cases) would be:
def getNextNode (node):
    # Case 1: right once then left many.
    if node.right != NULL:
        node = node.right
        while node.left != NULL:
            node = node.left
        return node
    # Case 2: up until we come from left.
    while node.parent != NULL:
        if node.parent.left == node:
            return node.parent
        node = node.parent
    # Case 3: we never came from left, no next node.
    return NULL
Since the effort is proportional to the height of the tree (we either go down or go up, never both), a balanced tree gives a time complexity of O(log N), since its height is logarithmic in the number of items.
The book is talking about balanced trees here, because it includes such snippets about them as:
This lookup is a fast operation because you eliminate half the nodes from your search on each iteration.
Lookup is an O(log(n)) operation in a binary search tree.
Lookup is only O(log(n)) if you can guarantee that the number of nodes remaining to be searched will be halved or nearly halved on each iteration.
So, while it admits in that last quote that a BST may not be balanced, the O(log N) property is only for those variants that are.
For non-balanced trees, the complexity (worst case) would be O(n) as you could end up with degenerate trees like:
S             D
 \           /
  x         x
   \         \
    x         x
     \         \
      x         x
       \         \
        x         x
       /           \
      D             S

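I think we can find the next highest node by simply finding the inorder successor of the node.
Steps, for a node that has a right child:
First, go to the right child of the node.
Then move left as far as possible. The node where you can go no further left (it need not be a leaf, since it may still have a right child) is the next highest node compared to the given node.
If the node has no right child, the successor is instead the nearest ancestor whose left subtree contains the node, if there is one.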
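The steps above lean on either a right child or a way to reach ancestors. As a minimal sketch, assuming nodes have no parent pointer and all values are distinct, the ancestor case can be handled by searching from the root and remembering the last node where the search turned left. The names Node and successor_from_root are hypothetical, chosen only for this illustration.

```python
class Node:
    """Hypothetical BST node without a parent pointer."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def successor_from_root(root, target):
    """Return the node holding the smallest value greater than target.value, or None."""
    # Case 1: a right subtree exists, so go right once, then left as far as possible.
    if target.right is not None:
        node = target.right
        while node.left is not None:
            node = node.left
        return node
    # Case 2: no right subtree. Search for target from the root and remember
    # the last node at which we turned left; that ancestor is the successor.
    candidate = None
    node = root
    while node is not None and node is not target:
        if target.value < node.value:
            candidate = node
            node = node.left
        else:
            node = node.right
    return candidate


# Example tree:      8
#                  /   \
#                 3     10
#                / \
#               1   6
#                  / \
#                 4   7
n4, n7 = Node(4), Node(7)
n6 = Node(6, n4, n7)
n1 = Node(1)
n3 = Node(3, n1, n6)
n10 = Node(10)
root = Node(8, n3, n10)
```

Both cases walk a single root-to-leaf path at most, so the cost is still proportional to the height of the tree.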

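Here is my implementation sketch in Java. Hope it helps.
Structure of Node
public class Node {
    private int value;
    private Node leftChild;
    private Node rightChild;
    private Node parent;
    public int getValue() { return value; }
    public Node getLeftChild() { return leftChild; }
    public Node getRightChild() { return rightChild; }
    public Node getParent() { return parent; }
    // Setters omitted for brevity.
}
Function to find next highest node
public Node findNextBiggest(Node n) {
    Node temp = n;
    if (n.getRightChild() == null) {
        // No right subtree: climb while the parent is smaller than the start node.
        while (n.getParent() != null && temp.getValue() > n.getParent().getValue()) {
            n = n.getParent();
        }
        // Either the first larger ancestor, or null if the start node was the largest.
        return n.getParent();
    } else {
        // Right subtree exists: go right once, then left as far as possible.
        n = n.getRightChild();
        while (n.getLeftChild() != null && n.getLeftChild().getValue() > temp.getValue()) {
            n = n.getLeftChild();
        }
        return n;
    }
}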

Related

Calculate the number of nodes on either side of an edge in a tree

A tree here means an acyclic undirected graph with n nodes and n-1 edges. For each edge in the tree, calculate the number of nodes on either side of it. If on removing the edge, you get two trees having a and b number of nodes, then I want to find those values a and b for all edges in the tree (ideally in O(n) time).
Intuitively I feel a multisource BFS starting from all the "leaf" nodes would yield an answer, but I'm not able to translate it into code.
For extra credit, provide an algorithm that works in any general graph.
Run a depth-first search (or a breadth-first search if you like it more) from any node.
That node will be called the root node, and all edges will be traversed only in the direction from the root node.
For each node, we calculate the number of nodes in its rooted subtree.
When a node is visited for the first time, we set this number to 1.
When the subtree of a child is fully visited, we add the size of its subtree to the parent.
After this, we know the number of nodes on one side of each edge.
The number on the other side is just the total minus the number we found.
(The extra credit version of your question involves finding bridges in the graph on top of this as a non-trivial part, and thus deserves to be asked as a separate question if you are really interested.)
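The procedure above can be sketched as follows. This is a minimal illustration, assuming nodes are labelled 0 to n-1 and the tree is given as an edge list; the function name edge_splits and the iterative (stack-based) traversal are choices made for this sketch, not part of the answer.

```python
def edge_splits(n, edges, root=0):
    """Map each tree edge (parent, child) to (nodes on child side, nodes on parent side)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from the root, recording each node's parent and the visit order.
    parent = [-1] * n
    seen = [False] * n
    seen[root] = True
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)

    # Every node appears after its parent in `order`, so walking it backwards
    # finishes each subtree before its size is added to the parent.
    size = [1] * n
    for u in reversed(order):
        if parent[u] != -1:
            size[parent[u]] += size[u]

    return {(parent[v], v): (size[v], n - size[v]) for v in range(n) if v != root}


splits = edge_splits(7, [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)])
```

Using an explicit stack instead of recursion avoids hitting the recursion limit on deep, path-like trees; the work is still O(n).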
Consider the following tree:
      1
     / \
    2   3
   / \  | \
  5   6 7  8
If we cut the edge between nodes 1 and 2, the tree will split into two trees, because in a tree there is exactly one path between any two nodes:
  1
   \
    3
    | \
    7  8
and
    2
   / \
  5   6
So, now a is the number of nodes rooted at 1 and b is number of nodes rooted at 2.
> Run one DFS considering any node as root.
> During DFS, for each node x, calculate nodes[x] and parent[x] where
nodes [x] = k means number of nodes of sub-tree rooted at x is k
parent[x] = y means y is parent of x.
> For any edge between node x and y where parent[x] = y:
a := nodes[root] - nodes[x]
b := nodes[x]
Time and space complexity both O(n).
Note that a + b = n. Because of this, you only need to count one side of each edge; the other side is n minus that count. This greatly simplifies things. A normal recursion over the nodes starting from the root is enough. Since your tree is undirected you don't really have a "root", just pick one of the leaves.
What you want to do is to "go down" the tree until you reach the bottom. Then you count backwards from there. A leaf returns 1, and each recursive step sums the return values for each edge and then increments the total by 1.
Here is the Java code. The function countEdges() takes the adjacency list of the tree, the current node, and the parent of the current node (here "parent" means the node from which the DFS reached the current node). It appears to assume that nodes are numbered from 1, with 0 passed as the parent in the initial call.
Here edge[][] stores the number of nodes on one side of each edge: edge[i][j] is the size of the subtree below j when j was reached from i. The number of nodes on the other side is (total nodes - edge[i][j]).
int edge[][];
int countEdges(ArrayList<Integer> adj[], int cur, int par) {
    // Start at 1 to count the current node itself. A leaf has no neighbor
    // other than its parent, so the loop below is skipped and it reports 1.
    int count = 1;
    // Count the nodes recursively for each neighbor of the current node.
    for (int neighbor : adj[cur]) {
        if (neighbor == par) continue;
        count += countEdges(adj, neighbor, cur);
    }
    // While returning from the recursion, record the result in the edge[][] matrix.
    return edge[par][cur] = count;
}
Since the DFS visits each node only once, the time complexity is O(V). Note that the edge[][] matrix itself takes O(V^2) space.

Runtime of the following recursive algorithm?

I am working through the book "Cracking the coding interview" by Gayle McDowell and came across an interesting recursive algorithm that sums the values of all the nodes in a balanced binary search tree.
int sum(Node node) {
    if (node == null) {
        return 0;
    }
    return sum(node.left) + node.value + sum(node.right);
}
Now Gayle says the runtime is O(N) which I find confusing as I don't see how this algorithm will ever terminate. For a given node, when node.left is passed to sum in the first call, and then node.right is consequently passed to sum in the second call, isn't the algorithm computing sum(node) for the second time around? Wouldn't this process go on forever? I'm still new to recursive algorithms so it might just not be very intuitive yet.
Cheers!
The process won't go on forever. The data structure in question is a Balanced Binary Search Tree and not a Graph which can contain cycles.
Starting from root, all the nodes will be explored in the manner - left -> itself -> right, like a Depth First Search.
node.left will explore the left subtree of a node and node.right will explore the right subtree of the same node. Both subtrees have nothing intersecting. Draw the trail of program control to see the order in which the nodes are explored and also to see that there is no overlapping in the traversal.
Since each node is visited only once, and the recursion starts unwinding when it reaches the null child below a leaf, the running time is O(N), N being the number of nodes.
The key to understanding a recursive algorithm is to trust that it does what it is deemed to. Let me explain.
First admit that the function sum(node) returns the sum of the values of all nodes of the subtree rooted at node.
Then the code
    if (node == null) {
        return 0;
    }
    return sum(node.left) + node.value + sum(node.right);
can do two things:
if node is null, return 0; this is a non-recursive case and the returned value is trivially correct;
otherwise, the function computes the sum for the left subtree plus the value at node plus the sum for the right subtree, i.e. the sum for the subtree rooted at node.
So in a way, if the function is correct, then it is correct :) Actually the argument isn't circular thanks to the non-recursive case, which is also correct.
We can use the same way of reasoning to prove the running time of the algorithm.
Assume that the time required to process the tree rooted at node is proportional to the size of this subtree; call it |T|. This is another act of faith.
Then if node is null, the time is constant, say 1 unit. And if node isn't null, the time is |L| + 1 + |R| units, which is precisely |T|. So if the time to sum a subtree is proportional to the size of the subtree, the time to sum a tree is proportional to the size of the tree!
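To make the counting argument concrete, here is a small sketch that returns the number of calls alongside the sum. It assumes a hypothetical Node class and a helper build_balanced that builds a balanced BST from a sorted list; neither comes from the book. For a tree with N nodes there is one call per node plus one per null child, which gives 2N + 1 calls in total.

```python
class Node:
    """Hypothetical tree node, used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def sum_and_calls(node):
    """Return (sum of values, number of calls made) for the subtree rooted at node."""
    if node is None:
        return 0, 1  # base case: one call, nothing to add
    left_sum, left_calls = sum_and_calls(node.left)
    right_sum, right_calls = sum_and_calls(node.right)
    return left_sum + node.value + right_sum, left_calls + 1 + right_calls


def build_balanced(values):
    """Build a balanced BST from a sorted list of values."""
    if not values:
        return None
    mid = len(values) // 2
    return Node(values[mid], build_balanced(values[:mid]), build_balanced(values[mid + 1:]))


total, calls = sum_and_calls(build_balanced(list(range(7))))
```

The call count grows linearly with N, and each call receives a strictly smaller subtree, so the recursion must terminate.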

Counting number of nodes in a complete binary tree

I want to count the number of nodes in a Complete Binary tree but all I can think of is traversing the entire tree. This will be a O(n) algorithm where n is the number of nodes in the tree. what could be the most efficient algorithm to achieve this?
Suppose that we start off by walking down the left and right spines of the tree to determine their heights. We'll either find that they're the same, in which case the last row is full, or we'll find that they're different. If the heights come back the same (say the height is h), then we know that there are 2^h - 1 nodes and we're done.
Otherwise, the heights must be h+1 and h, respectively. We know that there are then at least 2^h - 1 nodes, plus the number of nodes in the bottom layer of the tree. The question, then, is how to figure that out. One way to do this would be to find the rightmost node in the last layer. If you know at which index that node is, you know exactly how many nodes are in the last layer, so you can add that to 2^h - 1 and you're done.
If you have a complete binary tree with left height h+1, then there are between 1 and 2^h - 1 possible nodes that could be in the last layer. The question is then how to determine this as efficiently as possible.
Fortunately, since we know the nodes in the last layer get filled in from the left to the right, we can use binary search to try to figure out where the last filled node in the last layer is. Essentially, we guess the index where it might be, walk from the root of the tree down to where that leaf should be, and then either find a node there (so we know that the rightmost node in the bottom layer is either that node or to the right) or we don't (so we know that the rightmost node in the bottom layer must be purely to the left of the current location). We can walk down to where the kth node in the bottom layer should be by using the bits in the number k to guide a search down: we start at the root, then go left if the first bit of k is 0 and right if the first bit of k is 1, then use the remaining bits in a corresponding manner to walk down the tree. The total number of times we'll do this is O(h) and each probe takes time O(h), so the total work done here is O(h^2). Since h is the height of the tree, we know that h = O(log n), so this algorithm takes O(log^2 n) time to complete.
I'm not sure whether it's possible to improve upon this algorithm. I can get an Ω(log n) lower bound on any correct algorithm, though. I'm going to argue that any algorithm that is always correct in all cases must inspect the rightmost leaf node in the final row of the tree. To see why, suppose there's a tree T where the algorithm doesn't do this. Let's suppose that the rightmost node that the algorithm inspects in the bottom row is x, that the actual rightmost node in the bottom row is y, and that the leftmost missing node in the bottom row that the algorithm detected is z. We know that x must be to the left of y (because the algorithm didn't inspect the rightmost node in the bottom row) and that y must be to the left of z (because y exists and z doesn't, so z must be further to the right than y). If you think about what the algorithm's "knowledge" is at this point, the algorithm has no idea whether or not there are any nodes strictly between x and z. Therefore, if we were to give it a modified tree T' where we deleted y, the algorithm wouldn't notice that anything had changed and would have exactly the same execution path on T and T'. However, since T and T' have a different number of nodes, the algorithm has to be wrong on at least one of them. Inspecting this node takes time at least Ω(log n) because of the time required to walk down the tree.
In short, you can do better than O(n) with the above O(log^2 n)-time algorithm, and you might be able to do even better than that, though I'm not entirely sure how or whether that's possible. I suspect it isn't, because I suspect that binary search is the optimal way to check the bottom row, and the total length of the paths down to the nodes you'd probe, even after taking into account that they share nodes in common, is Θ(log^2 n), but I'm not sure how to prove it.
Hope this helps!
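The binary search described above might look like the following. This is only a sketch under stated assumptions: TreeNode, exists, count_nodes and build_complete are hypothetical names, depth is measured in edges down the left spine, and bottom-level positions are indexed from 0 on the left.

```python
class TreeNode:
    """Hypothetical node type with only child links."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def exists(root, depth, index):
    """Check whether bottom-level position `index` holds a node.
    The bits of index, most significant first, choose left (0) or right (1)."""
    node = root
    for bit in range(depth - 1, -1, -1):
        if node is None:
            return False
        node = node.right if (index >> bit) & 1 else node.left
    return node is not None


def count_nodes(root):
    """Count the nodes of a complete binary tree using O(log^2 n) node visits."""
    if root is None:
        return 0
    # The left spine reaches the bottom level; depth counts its edges.
    depth = 0
    node = root
    while node.left is not None:
        node = node.left
        depth += 1
    # Binary search for the largest bottom-level index that holds a node.
    # Position 0 always exists, because the left spine ends there.
    lo, hi = 0, (1 << depth) - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if exists(root, depth, mid):
            lo = mid
        else:
            hi = mid - 1
    # All levels above the bottom are full, plus lo + 1 nodes on the bottom level.
    return (1 << depth) - 1 + lo + 1


def build_complete(n):
    """Build a complete binary tree with n nodes using heap-style indexing."""
    nodes = [TreeNode() for _ in range(n)]
    for i in range(n):
        if 2 * i + 1 < n:
            nodes[i].left = nodes[2 * i + 1]
        if 2 * i + 2 < n:
            nodes[i].right = nodes[2 * i + 2]
    return nodes[0] if n else None
```

Each probe walks one root-to-bottom path, and the search makes about as many probes as the tree has levels, which is where the O(h^2) bound comes from.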
public int leftHeight(TreeNode root){
    int h=0;
    while(root!=null){
        root=root.left;
        h++;
    }
    return h;
}
public int rightHeight(TreeNode root){
    int h=0;
    while(root!=null){
        root=root.right;
        h++;
    }
    return h;
}
public int countNodes(TreeNode root) {
    if(root==null)
        return 0;
    int lh=leftHeight(root);
    int rh=rightHeight(root);
    if(lh==rh)
        return (1<<lh)-1;
    return countNodes(root.left)+countNodes(root.right)+1;
}
In each recursive call, we need to traverse along the left and right boundaries of the complete binary tree to compute the left and right heights. If they are equal, the tree is full, with 2^h-1 nodes. Otherwise we recurse on the left subtree and the right subtree. The first call is from the root (level=0), which takes O(h) time to get the left and right heights. We recurse until we get a subtree that is a full binary tree. In a complete binary tree at least one of the two subtrees of any node is full, so at each level only one of the two recursive calls goes deeper; the other returns right after its height computation. In the worst case we go all the way down to a leaf node, so the complexity is (h + (h-1) + (h-2) + ... + 0) = h(h+1)/2 = O(h^2). The space complexity is the size of the call stack, which is O(h).
NOTE: For a complete binary tree, h = O(log n).
If the binary tree is definitely complete (as opposed to 'nearly complete' or 'almost complete' as defined in the Wikipedia article) you should simply descend down one branch of the tree down to the leaf. This will be O(logn). Then sum the powers of two up to this depth. So 2^0 + 2^1... + 2^d
A C# sample might help others. Its time complexity is similar to the one explained well above by templatetypedef.
public int GetLeftHeight(TreeNode treeNode)
{
    int heightCnt = 0;
    while (treeNode != null)
    {
        heightCnt++;
        treeNode = treeNode.LeftNode;
    }
    return heightCnt;
}
public int CountNodes(TreeNode treeNode)
{
    int heightIndx = GetLeftHeight(treeNode);
    int nodeCnt = 0;
    while (treeNode != null)
    {
        int rightHeight = GetLeftHeight(treeNode.RightNode);
        nodeCnt += (int)Math.Pow(2, rightHeight); //(1 << rh);
        treeNode = (rightHeight == heightIndx - 1) ? treeNode.RightNode : treeNode.LeftNode;
        heightIndx--;
    }
    return nodeCnt;
}
Using recursion (this visits every node, so it takes O(n) time):
int countNodes(TreeNode* root) {
    if (!root){
        return 0;
    }
    else{
        return countNodes(root->left)+countNodes(root->right)+1;
    }
}

Find root of minimal height which is not BST

I'm struggling with following problem:
Write function which for given binary tree returns a root of minimal height which is not BST or NIL when tree is BST.
I know how to check if tree is BST but don't know how to rewrite it.
I would be grateful for an algorithm in pseudo code.
Rather than jumping right into an algorithm that works here, I'd like to give a series of observations that ultimately leads up to a really nice algorithm for this problem.
First, suppose that, for each node in the tree, you knew the value of the largest and smallest values in the subtree rooted at that node. (Let's denote these as min(x) and max(x), where x is a node in the tree). Given this information, we can make the following observation:
Observation 1: A node x is the root of a non-BST if x ≤ max(x.left) or if x ≥ min(x.right)
This is not an if-and-only-if condition - it's just an "if" - but it's a useful observation to have. The reason this works is that if x ≤ max(x.left), then there is a node in x's left subtree that's not smaller than x, meaning that the tree rooted at x isn't a BST, and if x ≥ min(x.right), then there's a node in x's right subtree that's not larger than x, meaning that the tree rooted at x isn't a BST.
Now, it's not necessarily the case that any node where x < min(x.right) and x > max(x.left) is the root of a BST. Consider this tree, for example:
    4
   / \
  1   6
 /     \
2       5
Here, the root node is larger than everything in its left subtree and smaller than everything in its right subtree, but the entire tree is itself not a BST. The reason for this is that the trees rooted at 1 and 6 aren't BSTs. This leads to a useful observation:
Observation 2: If x > max(x.left) and x < min(x.right), then the tree rooted at x is a BST if and only if the trees rooted at x.left and x.right are BSTs.
A quick sketch of a proof of this result: if x.left and x.right are BSTs, then doing an inorder traversal of the tree will list off all the values in x.left in ascending order, then x, then all the values in x.right in ascending order. Since x > max(x.left) and x < min(x.right), these values are sorted, so the tree is a BST. On the other hand, if either x.left or x.right are not BSTs, then the order in which these values come back won't be sorted, so the tree isn't a BST.
These two properties give a really nice way to find every node in the tree that isn't the root of a BST. The idea is to work through the nodes in the tree from the leaves upward, checking whether each node's value is greater than the max of its left subtree and less than the min of its right subtree, then checking whether its left and right subtrees are BSTs. You can do this with a postorder traversal, as shown here:
/* Does a postorder traversal of the tree, tagging each node with its
 * subtree min, subtree max, and whether the node is the root of a
 * BST.
 */
function findNonBSTs(r) {
    /* Edge case for an empty tree. */
    if (r is null) return;
    /* Process children - this is a postorder traversal. This also
     * tags each child with information about its min and max values
     * and whether it's a BST.
     */
    findNonBSTs(r.left);
    findNonBSTs(r.right);
    /* If either subtree isn't a BST, we're done. */
    if ((r.left != null && !r.left.isBST) ||
        (r.right != null && !r.right.isBST)) {
        r.isBST = false;
        return;
    }
    /* Otherwise, both children are BSTs. Check against the min and
     * max values of those subtrees to make sure we're in range.
     */
    if ((r.left != null && r.left.max >= r.value) ||
        (r.right != null && r.right.min <= r.value)) {
        r.isBST = false;
        return;
    }
    /* Otherwise, we're a BST, and our min and max value can be
     * computed from the left and right children.
     */
    r.isBST = true;
    r.min = (r.left != null? r.left.min : r.value);
    r.max = (r.right != null? r.right.max : r.value);
}
Once you've run this pass over the tree, every node will be tagged with whether it's a binary search tree or not. From there, all you have to do is make one more pass over the tree to find the deepest node that's not a BST. I'll leave that as a proverbial exercise for the reader. :-)
Hope this helps!
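One possible way to fold the tagging pass and the final search into a single postorder traversal is sketched below. It reads "minimal height" from the question literally and tracks subtree height directly; the names Node and min_height_non_bst are hypothetical, and the code assumes comparable values with duplicates treated as violations, as in the observations above.

```python
class Node:
    """Hypothetical binary tree node, used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def min_height_non_bst(root):
    """Return the root of a minimal-height subtree that is not a BST, or None."""
    best = None  # (height, node) of the best candidate found so far

    def visit(node):
        # Returns (is_bst, min_value, max_value, height); an empty tree has height -1.
        nonlocal best
        if node is None:
            return True, None, None, -1
        l_ok, l_min, l_max, l_h = visit(node.left)
        r_ok, r_min, r_max, r_h = visit(node.right)
        height = 1 + max(l_h, r_h)
        # Observation 2: both children are BSTs and the value lies strictly between them.
        ok = (l_ok and r_ok
              and (l_max is None or l_max < node.value)
              and (r_min is None or node.value < r_min))
        if not ok and (best is None or height < best[0]):
            best = (height, node)
        lo = min(v for v in (l_min, r_min, node.value) if v is not None)
        hi = max(v for v in (l_max, r_max, node.value) if v is not None)
        return ok, lo, hi, height

    visit(root)
    return best[1] if best is not None else None


# The example tree from above: 4 with children 1 (left child 2) and 6 (right child 5).
bad = Node(4, Node(1, Node(2)), Node(6, None, Node(5)))
good = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
```

On the example tree, the subtrees rooted at 1 and 6 both have height 1 and are not BSTs; this sketch returns the first one it meets in postorder, which is the node holding 1.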

Deleting all nodes in a binary tree using O(1) auxiliary storage space?

The standard algorithm for deleting all nodes in a binary tree uses a postorder traversal over the nodes along these lines:
if (root is not null) {
    recursively delete left subtree
    recursively delete right subtree
    delete root
}
This algorithm uses O(h) auxiliary storage space, where h is the height of the tree, because of the space required to store the stack frames during the recursive calls. However, it runs in time O(n), because every node is visited exactly once.
Is there an algorithm to delete all the nodes in a binary tree using only O(1) auxiliary storage space without sacrificing runtime?
It is indeed possible to delete all the nodes in a binary tree using O(n) time and O(1) auxiliary storage space by using an algorithm based on tree rotations.
Given a binary tree with the following shape:
        u
       / \
      v   C
     / \
    A   B
A right rotation of this tree pulls the node v above the node u and results in the following tree:
   v
  / \
 A   u
    / \
   B   C
Note that a tree rotation can be done in O(1) time and space by simply changing the root of the tree to be v, setting u's left child to be v's former right child, then setting v's right child to be u.
Tree rotations are useful in this context because a right rotation will always decrease the height of the left subtree of the tree by at least one. This is useful because of a clever observation: it is extremely easy to delete the root of the tree if it has no left child. In particular, if the tree is shaped like this:
v
 \
  A
Then we can delete all the nodes in the tree by deleting the node v, then deleting all the nodes in its subtree A. This leads to a very simple algorithm for deleting all the nodes in the tree:
while (root is not null) {
    if (root has a left child) {
        perform a right rotation
    } else {
        delete the root, and make the root's right child the new root.
    }
}
This algorithm clearly uses only O(1) storage space, because it needs at most a constant number of pointers to do a rotation or to change the root and the space for these pointers can be reused across all iterations of the loop.
Moreover, it can be shown that this algorithm also runs in O(n) time. Intuitively, it's possible to see this by looking at how many times a given edge can be rotated. First, notice that whenever a right rotation is performed, an edge that goes from a node to its left child is converted into a right edge that runs from the former child back to its parent. Next, notice that once we perform a rotation that moves node u to be the right child of node v, we will never touch node u again until we have deleted node v and all of v's left subtree. As a result, we can bound the number of total rotations that will ever be done by noting that every edge in the tree will be rotated with its parent at most once. Consequently, there are at most O(n) rotations done, each of which takes O(1) time, and exactly n deletions done. This means that the algorithm runs in time O(n) and uses only O(1) space.
In case it helps, I have a C++ implementation of this algorithm, along with a much more in-depth analysis of the algorithm's behavior. It also includes formal proofs of correctness for all of the steps of the algorithm.
Hope this helps!
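For readers who want to experiment with the rotation-based loop, here is a minimal Python sketch. Since Python is garbage collected, "deleting" a node is modelled by unlinking it and recording its value; Node and delete_all are hypothetical names, and the rotation counter is there only to check the bound argued above.

```python
class Node:
    """Hypothetical binary tree node, used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def delete_all(root):
    """Tear down the tree with O(1) extra pointers.
    Returns (values in deletion order, number of rotations performed)."""
    deleted = []
    rotations = 0
    while root is not None:
        if root.left is not None:
            # Right rotation: the left child v becomes the new root,
            # and the old root adopts v's former right subtree as its left subtree.
            v = root.left
            root.left = v.right
            v.right = root
            root = v
            rotations += 1
        else:
            # No left child: remove the root and continue with its right child.
            deleted.append(root.value)
            nxt = root.right
            root.right = None  # stands in for freeing the node
            root = nxt
    return deleted, rotations


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
order, rotations = delete_all(tree)
```

Apart from the result list, which exists only for inspection, the loop keeps just a handful of node references, matching the O(1) auxiliary space claim. For a BST the nodes come out in ascending order.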
Let me start with a serious joke: If you set the root of a BST to null, you effectively delete all the nodes in the tree (the garbage collector will make the space available). While the wording is Java specific, the idea holds for other programming languages. I mention this just in case you were at a job interview or taking an exam.
Otherwise, all you have to do is use a modified version of the DSW algorithm. Basically turn the tree into a backbone and then delete as you would a linked list. Space O(1) and time O(n). You should find talks of DSW in any textbook or online.
Basically DSW is used to balance a BST. But for your case, once you get the backbone, instead of balancing, you delete like you would a linked list.
Algorithm 1, O(n) time and O(1) space:
Delete node immediately unless it has both children. Otherwise get to the leftmost node reversing 'left' links to ensure all nodes are reachable - the leftmost node becomes new root:
void delete_tree(Node *node) {
    Node *p, *left, *right;
    for (p = node; p; ) {
        left = p->left;
        right = p->right;
        if (left && right) {
            Node *prev_p = nullptr;
            do {
                p->left = prev_p;
                prev_p = p;
                p = left;
            } while ((left = p->left) != nullptr);
            p->left = p->right;
            p->right = prev_p; //need it on the right to avoid loop
        } else {
            delete p;
            p = (left) ? left : right;
        }
    }
}
Algorithm 2, O(n) time and O(1) space: Traverse nodes depth-first, replacing child links with links to parent. Each node is deleted on the way up:
void delete_tree(Node *node) {
    Node *p, *left, *right;
    Node *upper = nullptr;
    for (p = node; p; ) {
        left = p->left;
        right = p->right;
        if (left && left != upper) {
            p->left = upper;
            upper = p;
            p = left;
        } else if (right && right != upper) {
            p->right = upper;
            upper = p;
            p = right;
        } else {
            delete p;
            p = upper;
            if (p)
                upper = (p->left) ? p->left : p->right;
        }
    }
}
I'm surprised by all the answers above that require complicated operations.
Removing nodes from a BST with O(1) additional storage is possible by simply replacing all recursive calls with a loop that searches for the node and also keeps track of the current node's parent. Using recursion is only simpler because the recursive calls automatically store all ancestors of the searched node in a stack. However, it's not necessary to store all ancestors. It's only necessary to store the searched node and its parent, so the searched node can be unlinked. Storing all ancestors is simply a waste of space.
Solution in Python 3 is below. Don't be thrown off by the seemingly recursive call to delete --- the maximum recursion depth here is 2 since the second call to delete is guaranteed to result in the delete base case (root node containing the searched value).
class Tree(object):
    def __init__(self, x):
        self.value = x
        self.left = None
        self.right = None
def remove_rightmost(parent, parent_left, child):
    while child.right is not None:
        parent = child
        parent_left = False
        child = child.right
    if parent_left:
        parent.left = child.left
    else:
        parent.right = child.left
    return child.value
def delete(t, q):
    if t is None:
        return None
    if t.value == q:
        if t.left is None:
            return t.right
        else:
            rightmost_value = remove_rightmost(t, True, t.left)
            t.value = rightmost_value
            return t
    rv = t
    while t is not None and t.value != q:
        parent = t
        if q < t.value:
            t = t.left
            parent_left = True
        else:
            t = t.right
            parent_left = False
    if t is None:
        return rv
    if parent_left:
        parent.left = delete(t, q)
    else:
        parent.right = delete(t, q)
    return rv
def deleteFromBST(t, queries):
    for q in queries:
        t = delete(t, q)
    return t

Resources