I'm trying to come up with an algorithm to construct a binary search tree using the elements from another binary search tree, but with the restriction that those elements have to be greater than or equal to some given integer, let's call it x.
I thought of a recursive approach (using in order traversal):
binary_tree (bst tree, int x) {
    if (tree is empty)
        return empty;
    if (tree->element >= x)
        insert tree->element in a new BST;
    else ????
}
I have no idea what the last recursive call would be, I obviously can't write two returns like this:
else
    return (tree->left, x)
    return (tree->right, x)
And I can't think of anything else, sorry if this is a silly question! I'm just starting with recursion and it's really confusing.
Let's think about what we are doing here. We want to construct a tree from an existing binary search tree. Because the existing tree is a BST, we get some helpful information.
For any node V, if V < x then every node in the subtree pointed to by V -> left is also smaller than x, so we no longer need to look in the left subtree. However, if we hit a node that is greater than or equal to x, we need to continue the recursion into both subtrees. Let's bring this all together in pseudocode:
newBST(root):
    if root is null
        return
    if root.val >= x
        addNewNode(root.val)
        newBST(root.right)
        newBST(root.left)
    else:
        newBST(root.right)
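A runnable Python sketch of the same pruning idea follows. The `Node` class, the `insert` helper and the name `filtered_bst` are assumptions made for illustration; they are not from the question. Values are inserted in pre-order, so each insert costs time proportional to the height of the new tree.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def insert(root, val):
    """Standard unbalanced BST insert; equal keys go to the right."""
    if root is None:
        return Node(val)
    if val < root.val:
        root.left = insert(root.left, val)
    else:
        root.right = insert(root.right, val)
    return root


def filtered_bst(root, x, new_root=None):
    """Return a BST holding every value >= x from the tree at root."""
    if root is None:
        return new_root
    if root.val >= x:
        # Keep this node; its left subtree may still hold values >= x.
        new_root = insert(new_root, root.val)
        new_root = filtered_bst(root.left, x, new_root)
    # The right subtree can always contain values >= x.
    return filtered_bst(root.right, x, new_root)


def inorder(root):
    """List the values of a tree in ascending order."""
    if root is None:
        return []
    return inorder(root.left) + [root.val] + inorder(root.right)
```

Calling `filtered_bst(tree, x)` leaves the original tree untouched and returns the root of the new one (or `None` if no value qualifies).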
It's a little tricky to do this recursively, because there isn't a 1-1 correspondence between subtrees in the tree you have and subtrees in the tree you want.
The simplest way to do this is to copy the values >= x into a list in order, and then build a tree from the list recursively.
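As a minimal sketch of this two-step approach (the `Node` class and the function names are hypothetical, chosen only for this example): first collect the qualifying values in ascending order, then build a tree from the middle of the list outward.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def collect_at_least(root, x, out):
    """Append, in ascending order, every value >= x in the BST at root."""
    if root is None:
        return
    if root.val >= x:
        # Only in this case can the left subtree hold values >= x.
        collect_at_least(root.left, x, out)
        out.append(root.val)
    collect_at_least(root.right, x, out)


def build_balanced(values, lo=0, hi=None):
    """Build a height-balanced BST from the sorted slice values[lo:hi]."""
    if hi is None:
        hi = len(values)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(values[mid],
                build_balanced(values, lo, mid),
                build_balanced(values, mid + 1, hi))


def filtered_bst(root, x):
    """Return a new BST containing the values >= x of the BST at root."""
    values = []
    collect_at_least(root, x, values)
    return build_balanced(values)
```

Picking the middle element as the root is a design choice of this sketch; it yields a balanced result, which the question does not require.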
The numbers 1 to n are inserted in a binary search tree in a specified order p_1, p_2,..., p_n. Describe an O(nlog n) time algorithm to construct the resulting final binary search tree.
Note that :-
I don't need average time n log n, but the worst time.
I need the exact tree that results when insertion takes place with the usual rules. AVL or red-black trees are not allowed.
This is an assignment question. It is very very non trivial. In fact it seemed impossible at first glance. I have thought on it much. My observations:-
The argument that we use to prove that sorting takes at least n log n time does not eliminate the existence of such an algorithm here.
If it is always possible to find a subtree in O(n) time whose size is between two fractions of the size of tree, the problem can be easily solved.
Choosing median or left child of root as root of subtree doesn't work.
The trick is not to use the constructed BST for lookups. Instead, keep an additional, balanced BST for lookups. Link the leaves.
For example, we might have
Constructed    Balanced
       3           2
      / \         / \
     2   D       1   3
    / \         / | | \
   1   C       a  b c  d
  / \
 A   B
where a, b, c, d are pointers to A, B, C, D respectively, and A, B, C, D are what would normally be null pointers.
To insert, insert into the balanced BST first (O(log n)), follow the pointer to the constructed tree (O(1)), do the constructed insert (O(1)), and relink the new leaves (O(1)).
As David Eisenstat doesn't have time to extend his answer, I'll try to put more details into a similar algorithm.
Intuition
The main intuition behind the algorithm is based on the following statements:
statement #1: if a BST contains values a and b (a < b) AND there are no values between them, then either A (node for value a) is a (possibly indirect) parent of B (node for value b) or B is a (possibly indirect) parent of A.
This statement is obviously true because if their lowest common ancestor C is some other node than A and B, its value c must be between a and b. Note that statement #1 is true for any BST (balanced or unbalanced).
statement #2: if a simple (unbalanced) BST contains values a and b (a < b) AND there are no values between them AND we are trying to add value x such that a < x < b, then X (node for value x) will be either direct right (greater) child of A or direct left (less) child of B whichever node is lower in the tree.
Let's assume that the lower of two nodes is a (the other case is symmetrical). During insertion phase value x will travel the same path as a during its insertion because tree doesn't contain any values between a and x i.e. at any comparison values a and x are indistinguishable. It means that value x will navigate tree till node A and will pass node B at some earlier step (see statement #1). As x > a it should become a right child of A. Direct right child of A must be empty at this point because A is in B's subtree i.e. all values in that subtree are less than b and since there are no values between a and b in the tree, no value can be right child of node A.
Note that statement #2 may not hold for a balanced BST after re-balancing has been performed, although this should be an unusual case.
statement #3: in a balanced BST for any value x not in the tree yet, you can find closest greater and closest less values in O(log(N)) time.
This follows directly from statements #1 and #2: all you need is find the potential insertion point for the value x in the BST (takes O(log(N))), one of the two values will be direct parent of the insertion point and to find the other you need to travel the tree back to the root (again takes O(log(N))).
So now the idea behind the algorithm becomes clear: for fast insertion into an unbalanced BST we need to find nodes with closest less and greater values. We can easily do it if we additionally maintain a balanced BST with the same keys as our target (unbalanced) BST and with corresponding nodes from that BST as values. Using that additional data structure we can find insertion point for each new value in O(log(N)) time and update this data structure with new value in O(log(N)) time as well.
Algorithm
1. Init "main" root and balancedRoot with null.
2. For each value x in the list do:
3. If this is the first value, just add it as the root node of both trees and go to #2.
4. In the tree specified by balancedRoot, find the nodes that correspond to the closest less (BalancedA, points to node A in the main BST) and closest greater (BalancedB, points to node B in the main BST) values.
5. If there is no closest lower value, i.e. we are adding the minimum element, add it as the left child of node B.
6. If there is no closest greater value, i.e. we are adding the maximum element, add it as the right child of node A.
7. Find whichever of nodes A or B is lower in the tree. You can use an explicit level stored in the node. If the lower node is A (less node), add x as the direct right child of A, else add x as the direct left child of B (greater node). Alternatively (and more cleverly) you may notice that it follows from statements #1 and #2 that exactly one of the two candidate insert positions (A's right child or B's left child) will be empty, and this is where you want to insert your value x.
8. Add value x to the balanced tree (might re-use from step #4).
9. Go to step #2.
As no inner step of the loop takes more than O(log(N)), total complexity is O(N*log(N))
Java implementation
I'm too lazy to implement balanced BST myself so I used standard Java TreeMap that implements Red-Black tree and has useful lowerEntry and higherEntry methods that correspond to step #4 of the algorithm (you may look at the source code to ensure that both are actually O(log(N))).
import java.util.Map;
import java.util.TreeMap;
public class BSTTest {
static class Node {
public final int value;
public Node left;
public Node right;
public Node(int value) {
this.value = value;
}
public boolean compareTree(Node other) {
return compareTrees(this, other);
}
public static boolean compareTrees(Node n1, Node n2) {
if ((n1 == null) && (n2 == null))
return true;
if ((n1 == null) || (n2 == null))
return false;
if (n1.value != n2.value)
return false;
return compareTrees(n1.left, n2.left) &&
compareTrees(n1.right, n2.right);
}
public void assignLeftSafe(Node child) {
if (this.left != null)
throw new IllegalStateException("left child is already set");
this.left = child;
}
public void assignRightSafe(Node child) {
if (this.right != null)
throw new IllegalStateException("right child is already set");
this.right = child;
}
@Override
public String toString() {
return "Node{" +
"value=" + value +
'}';
}
}
static Node insertToBst(Node root, int value) {
if (root == null)
root = new Node(value);
else if (value < root.value)
root.left = insertToBst(root.left, value);
else
root.right = insertToBst(root.right, value);
return root;
}
static Node buildBstDirect(int[] values) {
Node root = null;
for (int v : values) {
root = insertToBst(root, v);
}
return root;
}
static Node buildBstSmart(int[] values) {
Node root = null;
TreeMap<Integer, Node> balancedTree = new TreeMap<Integer, Node>();
for (int v : values) {
Node node = new Node(v);
if (balancedTree.isEmpty()) {
root = node;
} else {
Map.Entry<Integer, Node> lowerEntry = balancedTree.lowerEntry(v);
Map.Entry<Integer, Node> higherEntry = balancedTree.higherEntry(v);
if (lowerEntry == null) {
// adding minimum value
higherEntry.getValue().assignLeftSafe(node);
} else if (higherEntry == null) {
// adding max value
lowerEntry.getValue().assignRightSafe(node);
} else {
// adding some middle value
Node lowerNode = lowerEntry.getValue();
Node higherNode = higherEntry.getValue();
if (lowerNode.right == null)
lowerNode.assignRightSafe(node);
else
higherNode.assignLeftSafe(node);
}
}
// update balancedTree
balancedTree.put(v, node);
}
return root;
}
public static void main(String[] args) {
int[] input = new int[]{7, 6, 9, 4, 1, 8, 2, 5, 3};
Node directRoot = buildBstDirect(input);
Node smartRoot = buildBstSmart(input);
System.out.println(directRoot.compareTree(smartRoot));
}
}
Here's a linear-time algorithm. (I said that I wasn't going to work on this question, so if you like this answer, please award the bounty to SergGr.)
Create a doubly linked list with nodes 1..n and compute the inverse of p. For i from n down to 1, let q be the left neighbor of p_i in the list, and let r be the right neighbor. If p^-1(q) > p^-1(r), then make p_i the right child of q. If p^-1(q) < p^-1(r), then make p_i the left child of r. Delete p_i from the list.
In Python:
class Node(object):
    __slots__ = ('left', 'key', 'right')

    def __init__(self, key):
        self.left = None
        self.key = key
        self.right = None


def construct(p):
    # Validate the input.
    p = list(p)
    n = len(p)
    assert set(p) == set(range(n))  # 0 .. n-1
    # Compute p^-1.
    p_inv = [None] * n
    for i in range(n):
        p_inv[p[i]] = i
    # Set up the list.
    nodes = [Node(i) for i in range(n)]
    for i in range(n):
        if i >= 1:
            nodes[i].left = nodes[i - 1]
        if i < n - 1:
            nodes[i].right = nodes[i + 1]
    # Process p.
    for i in range(n - 1, 0, -1):  # n-1, n-2 .. 1
        q = nodes[p[i]].left
        r = nodes[p[i]].right
        if r is None or (q is not None and p_inv[q.key] > p_inv[r.key]):
            print(p[i], 'is the right child of', q.key)
        else:
            print(p[i], 'is the left child of', r.key)
        if q is not None:
            q.right = r
        if r is not None:
            r.left = q


construct([1, 3, 2, 0])
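The snippet above only prints the links. As a hedged variant, the sketch below records each key's parent in a list and includes a naive reference builder to compare against. The names `parents_linear` and `parents_naive`, and the use of index arrays in place of list nodes, are choices made for this example rather than part of the answer.

```python
def parents_linear(p):
    """Parent of each key in the BST built by inserting p[0], p[1], ... in order.

    p must be a permutation of 0 .. n-1. parent[k] is the key of k's parent,
    or None for the root p[0].
    """
    n = len(p)
    pos = [0] * n                      # pos[k] = insertion time of key k
    for t, k in enumerate(p):
        pos[k] = t
    # Doubly linked list over keys in sorted order; -1 and n mean "no neighbour".
    left = [k - 1 for k in range(n)]
    right = [k + 1 for k in range(n)]
    parent = [None] * n
    for t in range(n - 1, 0, -1):      # undo insertions, last one first
        k = p[t]
        q, r = left[k], right[k]
        # The parent is whichever neighbour was inserted later.
        if r == n or (q != -1 and pos[q] > pos[r]):
            parent[k] = q
        else:
            parent[k] = r
        # Unlink k so earlier keys see the neighbours they had when inserted.
        if q != -1:
            right[q] = r
        if r != n:
            left[r] = q
    return parent


def parents_naive(p):
    """Reference: ordinary BST insertion, quadratic in the worst case."""
    parent = [None] * len(p)
    lo, hi = {}, {}                    # left and right child links by key
    for k in p[1:]:
        cur = p[0]
        while True:
            child = lo if k < cur else hi
            if cur not in child:
                child[cur] = k
                parent[k] = cur
                break
            cur = child[cur]
    return parent
```

For the same input as above, `parents_linear([1, 3, 2, 0])` gives `[1, None, 3, 1]`: key 0 hangs under 1, key 2 under 3, key 3 under 1, and 1 is the root.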
Here's my O(n log^2 n) attempt that doesn't require building a balanced tree.
Put nodes in an array in their natural order (1 to n). Also link them into a linked list in the order of insertion. Each node stores its order of insertion along with the key.
The algorithm goes like this.
The input is a node in the linked list, and a range (low, high) of indices in the node array
Call the input node root; its key is rootkey. Unlink it from the list.
Determine which subtree of the input node is smaller.
Traverse the corresponding array range, unlink each node from the linked list, then link them in a separate linked list and sort the list again in the insertion order.
Heads of the two resulting lists are children of the input node.
Perform the algorithm recursively on children of the input node, passing ranges (low, rootkey-1) and (rootkey+1, high) as index ranges.
The sorting operation at each level gives the algorithm the extra log n complexity factor.
Here's an O(n log n) algorithm that can also be adapted to O(n log log m) time, where m is the range, by using a Y-fast trie rather than a balanced binary tree.
In a binary search tree, lower values are left of higher values. The order of insertion corresponds with the right-or-left node choices when traveling along the final tree. The parent of any node, x, is either the least higher number previously inserted or the greatest lower number previously inserted, whichever was inserted later.
We can identify and connect the listed nodes with their correct parents using the logic above in O(n log n) worst-time by maintaining a balanced binary tree with the nodes visited so far as we traverse the order of insertion.
Explanation:
Let's imagine a proposed lower parent, p. Now imagine there's a number, l > p but still lower than x, inserted before p. Either (1) p passed l during insertion, in which case x would have had to pass l to get to p but that contradicts that x must have gone right if it reached l; or (2) p did not pass l, in which case p is in a subtree left of l but that would mean a number was inserted that's smaller than l but greater than x, a contradiction.
Clearly, a number, l < x, greater than p that was inserted after p would also contradict p as x's parent since either (1) l passed p during insertion, which means p's right child would have already been assigned when x was inserted; or (2) l is in a subtree to the right of p, which again would mean a number was inserted that's smaller than l but greater than x, a contradiction.
Therefore, for any node, x, with a lower parent, that parent must be the greatest number lower than and inserted before x. Similar logic covers the scenario of a higher proposed parent.
Now let's imagine x's parent, p < x, was inserted before h, the lowest number greater than and inserted before x. Then either (1) h passed p, in which case p's right node would have been already assigned when x was inserted; or (2) h is in a subtree right of p, which means a number lower than h and greater than x was previously inserted but that would contradict our assertion that h is the lowest number inserted so far that's greater than x.
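The parent rule argued above can be sketched in Python as follows. This is an illustration under assumptions: the function name is hypothetical, keys are assumed distinct, and a plain sorted list with `bisect` stands in for the balanced tree. Inserting into a list costs O(n) per step, so this sketch shows the rule but does not reach the O(n log n) bound; a balanced tree would be needed for that.

```python
import bisect


def parents_by_neighbours(p):
    """Map each key to its parent key (None for the root) for insertion order p."""
    seen = []          # keys inserted so far, kept sorted
    when = {}          # key -> insertion time
    parent = {}
    for t, x in enumerate(p):
        i = bisect.bisect_left(seen, x)
        lower = seen[i - 1] if i > 0 else None         # greatest key below x
        higher = seen[i] if i < len(seen) else None    # least key above x
        if lower is None:
            parent[x] = higher       # None only for the very first key
        elif higher is None or when[lower] > when[higher]:
            parent[x] = lower        # the lower neighbour was inserted later
        else:
            parent[x] = higher       # the higher neighbour was inserted later
        seen.insert(i, x)            # O(n) here; O(log n) with a balanced tree
        when[x] = t
    return parent
```

A key becomes the right child of its parent when the parent is the lower neighbour, and the left child otherwise.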
Since this is an assignment, I'm posting a hint instead of an answer.
Sort the numbers, while keeping the insertion order. Say you have input: [1,7,3,5,8,2,4]. Then after sorting you will have [[1,0], [2,5], [3,2], [4, 6], [5,3], [7,1], [8,4]] . This is actually the in-order traversal of the resulting tree. Think hard about how to reconstruct the tree given the in-order traversal and the insertion order (this part will be linear time).
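One possible reading of this hint, which may not be what the answerer had in mind, is that the final tree is in-order by key and heap-ordered by insertion time, so a single stack pass over the sorted pairs can link it up. The sketch below assumes distinct keys; the function name is made up for this example, and it uses a general-purpose sort, so only the linking pass is linear.

```python
def parents_from_sorted_pairs(values):
    """Return {key: parent key} for the BST built by inserting values in order.

    The root maps to None.
    """
    # In-order sequence of the final tree: keys ascending, each with its time.
    pairs = sorted((key, t) for t, key in enumerate(values))
    parent = {}
    stack = []   # (key, time) on the rightmost path, times increasing upward
    for key, t in pairs:
        last = None
        # Smaller keys that were inserted later belong to key's left subtree.
        while stack and stack[-1][1] > t:
            last = stack.pop()[0]
        if last is not None:
            parent[last] = key       # earliest-inserted popped key: left child
        # key is the right child of the nearest smaller key inserted earlier.
        parent[key] = stack[-1][0] if stack else None
        stack.append((key, t))
    return parent
```

Each pair is pushed and popped at most once, which is what keeps the pass linear.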
More hints coming if you really need them.
A binary tree is given, and we have to count the number of binary search trees in it. Every leaf node is a BST.
I used the following approach:
for every node in the binary tree, check whether it is the root of a BST
The time complexity of the above approach is O(n^2). How can we do it efficiently, in O(n)?
If I understood the question correctly, the aim is to count the nodes that are the root of a binary search tree. As already remarked, every leaf is trivially the root of a binary search tree. A non-leaf node a is the root of a binary search tree if and only if the left child of a is the root of a binary search tree, the right child of a is the root of a binary search tree, the maximum over all values under the left child of a is not greater than the value of a, and the minimum over all values under the right child of a is larger than or equal to the value of a. This property can be evaluated recursively, visiting every node exactly once, which results in a linear runtime bound.
A straightforward recursive traversal of the tree returning a few extra pieces of data may help manage it in O(n) time, n being the number of nodes. Below you can find an implementation in Python.
numBST = 0

def traverse(root):
    global numBST
    leftComplies = True
    rightComplies = True
    rootRange = [root.val, root.val]
    if root.left is not None:
        leftResult = traverse(root.left)
        leftComplies = leftResult[0] and leftResult[1][1] < root.val
        rootRange[0] = leftResult[1][0]
    if root.right is not None:
        rightResult = traverse(root.right)
        rightComplies = rightResult[0] and rightResult[1][0] > root.val
        rootRange[1] = rightResult[1][1]
    if leftComplies and rightComplies:
        numBST += 1
    return (leftComplies and rightComplies, rootRange)
After you run traverse with root of the binary tree as parameter, numBST will contain the number of BSTs within the root.
The function traverse given above recursively traverses the tree whose root is given to it as a parameter. For each node V, if V has a left child L, it recursively traverses the left child and returns some data. Specifically, it returns a pair. The first element of the pair is a boolean value indicating whether the left subtree rooted in L is a BST. The second element is a list containing the smallest and the largest value, respectively, in the subtree rooted in L.
For the tree rooted in V to be a BST, the subtree rooted in L must also be a BST AND the largest value in the subtree rooted in L (hence all the values in that subtree) must be smaller than the value stored in V. So after recursively calling traverse for L, we check the returned data to find out if these conditions are satisfied.
Similarly, if there is a right child R of V, it is recursively traversed. To be a BST, the tree rooted in V must also satisfy the condition that the tree rooted in R is a BST AND the smallest value in the subtree rooted in R (hence all the values in that subtree) is larger than the value stored in V.
If all these conditions are satisfied, the tree rooted in V can be considered as a BST and the result, stored in numBST, is updated accordingly. Note that we also update the smallest and largest values stored in V as we recursively traverse its children L and R, and perform the checks mentioned above, so that we pass the correctly updated values to the higher levels of recursion.
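For readers who prefer to avoid the global counter, here is a self-contained variant of the same traversal. It is only a sketch: the `Node` class and the name `count_bst_subtrees` are assumptions made for this example, and like the code above it requires strict inequalities (no duplicate values).

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def count_bst_subtrees(root):
    """Return how many nodes of the binary tree are the root of a BST."""
    count = 0

    def visit(node):
        # Returns (is_bst, smallest value, largest value) for node's subtree.
        # The range is only meaningful when is_bst is True, which is enough,
        # because a failed subtree already disqualifies every ancestor.
        nonlocal count
        ok, lo, hi = True, node.val, node.val
        if node.left is not None:
            l_ok, l_lo, l_hi = visit(node.left)
            ok = ok and l_ok and l_hi < node.val
            lo = l_lo
        if node.right is not None:
            r_ok, r_lo, r_hi = visit(node.right)
            ok = ok and r_ok and r_lo > node.val
            hi = r_hi
        if ok:
            count += 1
        return ok, lo, hi

    if root is not None:
        visit(root)
    return count
```

Each node is visited once, so the running time stays linear in the number of nodes.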
I'm struggling with following problem:
Write a function which, for a given binary tree, returns the root of a minimal-height subtree which is not a BST, or NIL when the tree is a BST.
I know how to check if tree is BST but don't know how to rewrite it.
I would be grateful for an algorithm in pseudo code.
Rather than jumping right into an algorithm that works here, I'd like to give a series of observations that ultimately leads up to a really nice algorithm for this problem.
First, suppose that, for each node in the tree, you knew the value of the largest and smallest values in the subtree rooted at that node. (Let's denote these as min(x) and max(x), where x is a node in the tree). Given this information, we can make the following observation:
Observation 1: A node x is the root of a non-BST if x ≤ max(x.left) or if x ≥ min(x.right)
This is not an if-and-only-if condition - it's just an "if" - but it's a useful observation to have. The reason this works is that if x ≤ max(x.left), then there is a node in x's left subtree that's not smaller than x, meaning that the tree rooted at x isn't a BST, and if x ≥ min(x.right), then there's a node in x's right subtree that's not larger than x, meaning that the tree rooted at x isn't a BST.
Now, it's not necessarily the case that any node where x < min(x.right) and x > max(x.left) is the root of a BST. Consider this tree, for example:
      4
     / \
    1   6
   /     \
  2       5
Here, the root node is larger than everything in its left subtree and smaller than everything in its right subtree, but the entire tree is itself not a BST. The reason for this is that the trees rooted at 1 and 6 aren't BSTs. This leads to a useful observation:
Observation 2: If x > max(x.left) and x < min(x.right), then x is a BST if and only if x.left and x.right are BSTs.
A quick sketch of a proof of this result: if x.left and x.right are BSTs, then doing an inorder traversal of the tree will list off all the values in x.left in ascending order, then x, then all the values in x.right in ascending order. Since x > max(x.left) and x < min(x.right), these values are sorted, so the tree is a BST. On the other hand, if either x.left or x.right are not BSTs, then the order in which these values come back won't be sorted, so the tree isn't a BST.
These two properties give a really nice way to find every node in the tree that isn't the root of a BST. The idea is to work through the nodes in the tree from the leaves upward, checking whether each node's value is greater than the max of its left subtree and less than the min of its right subtree, then checking whether its left and right subtrees are BSTs. You can do this with a postorder traversal, as shown here:
/* Does a postorder traversal of the tree, tagging each node with its
 * subtree min, subtree max, and whether the node is the root of a
 * BST.
 */
function findNonBSTs(r) {
    /* Edge case for an empty tree. */
    if (r is null) return;

    /* Process children - this is a postorder traversal. This also
     * tags each child with information about its min and max values
     * and whether it's a BST.
     */
    findNonBSTs(r.left);
    findNonBSTs(r.right);

    /* If either subtree isn't a BST, we're done. */
    if ((r.left != null && !r.left.isBST) ||
        (r.right != null && !r.right.isBST)) {
        r.isBST = false;
        return;
    }

    /* Otherwise, both children are BSTs. Check against the min and
     * max values of those subtrees to make sure we're in range.
     */
    if ((r.left != null && r.left.max >= r.value) ||
        (r.right != null && r.right.min <= r.value)) {
        r.isBST = false;
        return;
    }

    /* Otherwise, we're a BST, and our min and max value can be
     * computed from the left and right children.
     */
    r.isBST = true;
    r.min = (r.left != null ? r.left.min : r.value);
    r.max = (r.right != null ? r.right.max : r.value);
}
Once you've run this pass over the tree, every node will be tagged with whether it's a binary search tree or not. From there, all you have to do is make one more pass over the tree to find the deepest node that's not a BST. I'll leave that as a proverbial exercise for the reader. :-)
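For reference, here is one way the whole thing could look in Python. It is a sketch under assumptions: it reads "minimal height" in the question as "the non-BST subtree with the fewest levels below its root", it folds the tagging pass and the search into a single postorder traversal instead of two passes, and the `Node` class and function name are invented for the example.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def min_height_non_bst(root):
    """Return a non-BST subtree root of minimal height, or None if root is a BST."""
    best = None            # (height, node) of the best candidate so far

    def visit(node):
        # Returns (is_bst, min value, max value, height) of node's subtree.
        nonlocal best
        ok, lo, hi, height = True, node.value, node.value, 0
        for child, is_left in ((node.left, True), (node.right, False)):
            if child is None:
                continue
            c_ok, c_lo, c_hi, c_h = visit(child)
            height = max(height, c_h + 1)
            lo, hi = min(lo, c_lo), max(hi, c_hi)
            if is_left:
                ok = ok and c_ok and c_hi < node.value
            else:
                ok = ok and c_ok and c_lo > node.value
        if not ok and (best is None or height < best[0]):
            best = (height, node)
        return ok, lo, hi, height

    if root is not None:
        visit(root)
    return None if best is None else best[1]
```

When several non-BST subtrees share the minimal height, this sketch returns the first one met in postorder.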
Hope this helps!
In BST, according to Programming Interviews Exposed
"Given a node, you can even find the next highest node in O(log(n)) time" Pg 65
A node in BST has right child as the next highest node, then why O(log(n))? Please correct
First answer the question, then negate it
With regard to your comment "A node in BST has right child as the next highest node" (assuming here "next highest" means the next sequential value) - no, it doesn't.
That can be the case if the right child has no left sub-tree, but it's not always so.
The next sequential value (I'm using that term rather than "highest", since the latter could be confused with tree height, and rather than "largest", which implies a specific low-to-high order rather than any order) comes from one of two places.
First, if the current node has a right child, move to that right child then, as long as you can see a left child, move to it.
In other words, with S and D as the source (current) and destination (next largest):
    S
   / \
  x   x <- This is the node your explanation chose,
     / \      but it's the wrong one in this case.
    x   x
   /
  D <----- This is the actual node you want.
   \
    x
Otherwise (i.e., if the current node has no right child), you need to move up to the parent continuously (so nodes need a right, left and parent pointer) until the node you moved from was a left child. If you get to the root and you still haven't moved up from a left child, your original node was already the highest in the tree.
Graphically that entire process is illustrated with:
x
 \
  D <- Walking up the tree until you came up
 / \   from a left node.
x   x
     \
      x
     / \
    x   S
   /
  x
The pseudo-code for such a function (that covers both those cases) would be:
def getNextNode (node):
    # Case 1: right once then left many.
    if node.right != NULL:
        node = node.right
        while node.left != NULL:
            node = node.left
        return node
    # Case 2: up until we come from left.
    while node.parent != NULL:
        if node.parent.left == node:
            return node.parent
        node = node.parent
    # Case 3: we never came from left, no next node.
    return NULL
Since the effort is proportional to the height of the tree (we either walk down or walk up), a balanced tree will have a time complexity of O(log N), since its height is logarithmic in the number of items.
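A runnable Python version of that pseudo-code might look like the sketch below. The `Node` class, the `insert` helper that maintains parent pointers, and `values_in_order` are assumptions added so the example can be run on its own.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = self.right = self.parent = None


def insert(root, value):
    """Unbalanced BST insert that also maintains parent pointers."""
    node = Node(value)
    if root is None:
        return node
    cur = root
    while True:
        if value < cur.value:
            if cur.left is None:
                cur.left, node.parent = node, cur
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right, node.parent = node, cur
                return root
            cur = cur.right


def next_node(node):
    """Return the in-order successor of node, or None if node holds the maximum."""
    # Case 1: right once, then left as far as possible.
    if node.right is not None:
        node = node.right
        while node.left is not None:
            node = node.left
        return node
    # Case 2: climb until we arrive from a left child.
    while node.parent is not None:
        if node.parent.left is node:
            return node.parent
        node = node.parent
    # Case 3: never arrived from a left child, so there is no successor.
    return None


def values_in_order(root):
    """Walk the whole tree by repeated successor steps."""
    node = root
    while node is not None and node.left is not None:
        node = node.left
    out = []
    while node is not None:
        out.append(node.value)
        node = next_node(node)
    return out
```

Stepping through the whole tree this way visits the values in ascending order, which is a convenient check that both cases are handled.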
The book is talking about balanced trees here, because it includes such snippets about them as:
This lookup is a fast operation because you eliminate half the nodes from your search on each iteration.
Lookup is an O(log(n)) operation in a binary search tree.
Lookup is only O(log(n)) if you can guarantee that the number of nodes remaining to be searched will be halved or nearly halved on each iteration.
So, while it admits in that last quote that a BST may not be balanced, the O(log N) property is only for those variants that are.
For non-balanced trees, the complexity (worst case) would be O(n) as you could end up with degenerate trees like:
S             D
 \           /
  x         x
   \         \
    x         x
     \         \
      x         x
       \         \
        x         x
       /           \
      D             S
I think we can find the next highest node by simply finding the inorder successor of the node.
Steps -
First, go to the right child of the node.
Then move as far left as possible. The node where you stop (the one with no left child) is the next highest node compared to the given node.
Here is my pseudo implementation in Java. Hope it helps.
Structure of Node
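public class Node {
    int value;
    Node leftChild;
    Node rightChild;
    Node parent;

    int getValue() { return value; }
    Node getLeftChild() { return leftChild; }
    Node getRightChild() { return rightChild; }
    Node getParent() { return parent; }
}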
Function to find next highest node
public Node findNextBiggest(Node n) {
    Node temp = n;
    if (n.getRightChild() == null) {
        // No right subtree: climb until the parent holds a larger value.
        while (n.getParent() != null && temp.getValue() > n.getParent().getValue()) {
            n = n.getParent();
        }
        return n.getParent();
    } else {
        // Right subtree exists: go right once, then left as far as possible.
        n = n.getRightChild();
        while (n.getLeftChild() != null && n.getLeftChild().getValue() > temp.getValue()) {
            n = n.getLeftChild();
        }
        return n;
    }
}