What is the optimal way to find common nodes in two BSTs?

I am looking for the best approach in terms of complexity (space and time).
My approach so far:
Traverse one tree in-order and, for each nodeId, search for that nodeId in the second tree.
Node structure:
struct node {
    long long nodeId;
    node *left;
    node *right;
};
Please let me know if anything about the question is unclear.

Assuming one tree has n nodes and the other has m, your approach is O(n*m) in the worst case (each lookup can cost O(m) in an unbalanced tree), while you could do it in O(n+m).
Let us say you have two pointers, one into each tree (location_1 and location_2). Start going through both trees at the same time, and each time decide which pointer to advance. Say location_1 is pointing at a node with nodeId = 4 and location_2 is at nodeId = 7. In this case location_1 moves forward, since every node location_2 has yet to visit has a value of at least 7, so a value as small as 4 cannot be matched there.
By "moves forward" I mean an in-order traversal of the tree.

You can do this with a standard merge in O(n+m) time. The general algorithm is:
n1 = Tree1.Root
n2 = Tree2.Root
while (n1 != NULL && n2 != NULL)
{
    if (n1.Value == n2.Value)
    {
        // node exists in both trees. Advance to next node.
        n1 = GetInorderSuccessor(n1)
        n2 = GetInorderSuccessor(n2)
    }
    else if (n1.Value > n2.Value)
    {
        // n2 is not in Tree1
        n2 = GetInorderSuccessor(n2)
    }
    else
    {
        // n1 is smaller than n2
        // n1 is not in Tree2
        n1 = GetInorderSuccessor(n1)
    }
}
// At this point, you're at the end of one of the trees.
// Any nodes remaining in the other tree are not common to both.
while (n1 != NULL)
{
    // n1 is not in Tree2. Output it,
    // and then get the next node.
    n1 = GetInorderSuccessor(n1)
}
while (n2 != NULL)
{
    // n2 is not in Tree1. Output it,
    // and then get the next node.
    n2 = GetInorderSuccessor(n2)
}
The only hard part here is that GetInorderSuccessor call. If you're doing this with a tree, then you'll need to maintain state through successive calls to that function. You can't depend on a standard recursive tree traversal to do it for you. Basically, you need a tree iterator.
The other option is to first traverse each tree to make a list of the nodes in order, and then write the merge to work with those lists.
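For illustration, here is a minimal sketch of that list-based variant in C++ (using the node struct from the question); it trades the trickier iterator state for O(n+m) extra space:

#include <iostream>
#include <vector>

struct node {            // node struct as given in the question
    long long nodeId;
    node *left;
    node *right;
};

// Collect the nodeIds of a BST in sorted order via an in-order walk.
void inorder(const node* n, std::vector<long long>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->nodeId);
    inorder(n->right, out);
}

// Standard two-pointer merge over the two sorted lists, printing the common ids.
// O(n + m) time, O(n + m) extra space.
void printCommon(const node* root1, const node* root2) {
    std::vector<long long> a, b;
    inorder(root1, a);
    inorder(root2, b);
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] == b[j])     { std::cout << a[i] << '\n'; ++i; ++j; }
        else if (a[i] < b[j]) ++i;
        else                  ++j;
    }
}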

A slightly modified Morris traversal also solves this, advancing through the two trees much like the merge step of merge sort. Time complexity: O(m + n), space complexity: O(1).
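For reference, here is one way such a Morris-style iterator could look in C++ (a sketch, not from the answer above), reusing the node struct from the question. Each call to next() yields the next in-order node using temporary threaded right pointers instead of a stack, so the merge needs only O(1) extra space; note that the trees are modified while iterating, and stopping early would leave threads behind.

#include <iostream>

// Sketch of an O(1)-space in-order iterator based on Morris traversal.
struct MorrisIterator {
    node* cur;
    explicit MorrisIterator(node* root) : cur(root) {}

    // Returns the next node in in-order sequence, or nullptr when exhausted.
    node* next() {
        while (cur) {
            if (!cur->left) {                        // no left subtree: visit cur
                node* visited = cur;
                cur = cur->right;
                return visited;
            }
            node* pred = cur->left;                  // find in-order predecessor
            while (pred->right && pred->right != cur)
                pred = pred->right;
            if (!pred->right) {                      // thread not created yet
                pred->right = cur;                   // link predecessor back to cur
                cur = cur->left;
            } else {                                 // left subtree is done
                pred->right = nullptr;               // remove the thread
                node* visited = cur;
                cur = cur->right;
                return visited;
            }
        }
        return nullptr;
    }
};

// The same merge as above, driven by the two iterators.
void printCommonConstantSpace(node* root1, node* root2) {
    MorrisIterator it1(root1), it2(root2);
    node *a = it1.next(), *b = it2.next();
    while (a && b) {
        if (a->nodeId == b->nodeId) {
            std::cout << a->nodeId << '\n';
            a = it1.next();
            b = it2.next();
        }
        else if (a->nodeId < b->nodeId) a = it1.next();
        else                            b = it2.next();
    }
}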

Related

Split a Binary Tree using specific methods

Given a binary tree, I have to return a tree containing all elements smaller than k, a tree containing all elements greater than k, and a tree containing only one element - k.
Allowed methods to use:
remove node - O(n)
insert - O(n)
find - O(n)
find min - O(n)
I'm assuming these complexities for the methods, because the exercise doesn't state that the tree is balanced.
Required complexity - O(n)
The original tree has to maintain its structure.
I'm completely stuck. Any help is much appreciated!
The given tree is a binary search tree, and the outputs should be binary search trees as well.
I see no way to design an O(n) algorithm with the given blackbox functions and their time complexities, given that they could only be called a (maximum) constant number of times (like 3 times) to stay within the O(n) constraint.
But if it is allowed to access and create BSTs with basic, standard node manipulations (traversing via left or right child, setting the left or right child to a given subtree), then you could do the following:
Create three new empty BSTs that will be populated and returned. Name them left, mid, and right, where the first one will have all values less than k, the second one will have at most one node (with value k), and the final one will have all the rest.
While populating left and right, maintain references to the nodes that are closest to value k: in left that will be the node with the greatest value, and in right the node with the least value.
Follow these steps:
Apply the usual binary search to walk from the root towards the node with value k
While doing this: whenever you choose the left child of a node, the node itself and its right subtree belong in right. However, the left child should at this moment not be included, so create a new node that copies the current node, but without its left child. Maintain a reference to the node with the least value in right, as that is the node that may get a new left subtree when this step occurs more than once.
Do the mirrored thing when you choose the right child of a node.
When the node with k is found, the algorithm can add its left subtree to left and the right subtree to right, and create the single-node tree with value k.
Time complexity
The search towards the node with value k could take O(n) in the worst case, as the BST is not given to be balanced. All the other actions (adding a subtree to a specific node in one of the new BSTs) run in constant time, so in total they are executed O(n) times in the worst case.
If the given BST is balanced (not necessarily perfectly, but e.g. according to AVL rules), then the algorithm runs in O(log n) time. However, the output BSTs may not be as balanced, and may violate AVL rules, so that rotations would be needed.
Example Implementation
Here is an implementation in JavaScript. When you run this snippet, a test case will run on a BST that has nodes with values 0..19 (inserted in random order) and k=10. The output iterates the three created BSTs in-order, to verify that they yield 0..9, 10, and 11..19 respectively:
class Node {
constructor(value, left=null, right=null) {
this.value = value;
this.left = left;
this.right = right;
}
insert(value) { // Insert as a leaf, maintaining the BST property
if (value < this.value) {
if (this.left !== null) {
return this.left.insert(value);
}
this.left = new Node(value);
return this.left;
} else {
if (this.right !== null) {
return this.right.insert(value);
}
this.right = new Node(value);
return this.right;
}
}
// Utility function to iterate the BST values in in-order sequence
* [Symbol.iterator]() {
if (this.left !== null) yield * this.left;
yield this.value;
if (this.right !== null) yield * this.right;
}
}
// The main algorithm
function splitInThree(root, k) {
let node = root;
// Variables for the roots of the trees to return:
let left = null;
let mid = null;
let right = null;
// Reference to the nodes that are lexically closest to k:
let next = null;
let prev = null;
while (node !== null) {
// Create a copy of the current node
const newNode = new Node(node.value);
if (k < node.value) {
// All nodes at the right go with it, but it gets no left child at this stage
newNode.right = node.right;
// Merge this with the tree we are creating for nodes with value > k
if (right === null) {
right = newNode;
} else {
next.left = newNode;
}
next = newNode;
node = node.left;
} else if (k > node.value) {
// All nodes at the left go with it, but it gets no right child at this stage
newNode.left = node.left;
// Merge this with the tree we are creating for nodes with value < k
if (left === null) {
left = newNode;
} else {
prev.right = newNode;
}
prev = newNode;
node = node.right;
} else {
// Create the root-only tree for k
mid = newNode;
// The left subtree belongs in the left tree
if (left === null) {
left = node.left;
} else {
prev.right = node.left;
}
// ...and the right subtree in the right tree
if (right === null) {
right = node.right;
} else {
next.left = node.right;
}
// All nodes have been allocated to a target tree
break;
}
}
// return the three new trees:
return [left, mid, right];
}
// === Test code for the algorithm ===
// Utility function
function shuffled(a) {
for (let i = a.length - 1; i > 0; i--) {
const j = Math.floor(Math.random() * (i + 1));
[a[i], a[j]] = [a[j], a[i]];
}
return a;
}
// Create a shuffled array of the integers 0...19
let arr = shuffled([...Array(20).keys()]);
// Insert these values into a new BST:
let root = new Node(arr.pop());
for (let val of arr) root.insert(val);
// Apply the algorithm with k=10
let [left, mid, right] = splitInThree(root, 10);
// Print out the values from the three BSTs:
console.log(...left); // 0..9
console.log(...mid); // 10
console.log(...right); // 11..19
Essentially, your goal is to create a valid BST where k is the root node; in this case, the left subtree is a BST containing all elements less than k, and the right subtree is a BST containing all elements greater than k.
This can be achieved by a series of tree rotations:
First, do an O(n) search for the node of value k, building a stack of its ancestors up to the root node.
While there are any remaining ancestors, pop one from the stack, and perform a tree rotation making k the parent of this ancestor.
Each rotation takes O(1) time, so this algorithm terminates in O(n) time, because there are at most O(n) ancestors. In a balanced tree, the algorithm takes O(log n) time, although the result is not a balanced tree.
In your question you write that "insert" and "remove" operations take O(n) time, but that this is your assumption, i.e. it is not stated in the exercise that these operations take O(n) time. If you are operating only on nodes you already have pointers to, then basic pointer operations take O(1) time.
If it is required not to destroy the original tree, then you can begin by making a copy of it in O(n) time.
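As a rough illustration of the rotate-to-root idea (my own sketch under the assumptions above: a plain node struct with integer keys, k present in the tree, and no copying of the original tree):

#include <vector>

struct Node {                 // minimal node type assumed for this sketch
    int key;
    Node *left  = nullptr;
    Node *right = nullptr;
};

// Rotate the node holding k up to the root by rotating it above its parent,
// one ancestor at a time. Returns the new root; afterwards root->left holds
// all keys < k and root->right all keys > k.
Node* rotateToRoot(Node* root, int k) {
    std::vector<Node*> ancestors;          // stack: path from root down to k
    Node* cur = root;
    while (cur && cur->key != k) {
        ancestors.push_back(cur);
        cur = (k < cur->key) ? cur->left : cur->right;
    }
    if (!cur) return root;                 // k not found: leave the tree as is
    while (!ancestors.empty()) {
        Node* parent = ancestors.back();
        ancestors.pop_back();
        if (parent->left == cur) {         // right rotation around parent
            parent->left = cur->right;
            cur->right = parent;
        } else {                           // left rotation around parent
            parent->right = cur->left;
            cur->left = parent;
        }
        if (!ancestors.empty()) {          // re-attach under the next ancestor
            Node* grand = ancestors.back();
            if (grand->left == parent) grand->left = cur;
            else                       grand->right = cur;
        }
    }
    return cur;                            // k is now the root
}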
I really don't see a simple and efficient way to split with the operations that you mention. But I think that achieving a very efficient split is relatively easy.
If the tree is balanced, then you can perform your split in O(log n) if you define a special operation called join exclusive. Let me first define join_ex() as the operation in question:
Node * join_exclusive(Node *& ts, Node *& tg)
{
    if (ts == NULL)
        return tg;
    if (tg == NULL)
        return ts;
    tg->llink = join_exclusive(ts->rlink, tg->llink);
    ts->rlink = tg;
    Node * ret_val = ts;
    ts = tg = NULL; // empty the trees
    return ret_val;
}
join_ex() assumes that you want to build a new tree from two BSTs ts and tg such that every key in ts is less than every key in tg.
If you have two such exclusive trees T< and T>, then join_ex() merges them into a single BST (the original answer illustrates this with a figure).
Note that if you take any node of any BST, then its subtrees meet this condition: every key in the left subtree is less than every key in the right one. You can design a nice deletion algorithm based on join_ex().
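For instance, a deletion based on join_ex() might look roughly like this (a sketch of that idea, not part of the original answer; remove_key is a hypothetical name):

// Unlinks the node holding `key` (if any) from the tree referenced by `root`,
// gluing its two subtrees back together with join_exclusive. Returns the
// removed node, or NULL if the key is not present.
Node * remove_key(Node *& root, const key_type & key)
{
    if (root == NULL)
        return NULL;
    if (key < root->key)
        return remove_key(root->llink, key);
    if (root->key < key)
        return remove_key(root->rlink, key);
    Node * removed = root;                                   // found the key
    root = join_exclusive(removed->llink, removed->rlink);   // join its subtrees
    return removed;                                          // its links are now NULL
}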
Now we are ready for the split operation:
void split_key_rec(Node * root, const key_type & key, Node *& ts, Node *& tg)
{
    if (root == NULL)
    {
        ts = tg = NULL;
        return;
    }
    if (key <= root->key)
    {
        // root and its right subtree hold keys >= key
        split_key_rec(root->llink, key, ts, root->llink);
        tg = root;
    }
    else
    {
        // root and its left subtree hold keys < key
        split_key_rec(root->rlink, key, root->rlink, tg);
        ts = root;
    }
}
split_key_rec() splits the tree into two trees ts and tg according to a key k. At the end of the operation, ts contains a BST with keys less than k and tg a BST with keys greater than or equal to k.
Now, to complete your requirement, you call split_key_rec(t, k, ts, tg) and you get in ts a BST with all the keys less than k. Almost symmetrically, you get in tg a BST with all the keys greater than or equal to k. So, the last thing is to verify whether the root of tg is k and, if this is the case, you unlink it, and you get your result in ts, k, and tg' (tg' is the tree without k).
If k is in the original tree, then the root of tg will be k, and tg won't have a left subtree.
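Putting that last step into code, the wrapper could look roughly like this (a sketch with a hypothetical name, assuming k really is in the tree so that it ends up as the left-subtree-less root of tg):

void split_in_three(Node * t, const key_type & k,
                    Node *& ts, Node *& mid, Node *& tg)
{
    split_key_rec(t, k, ts, tg); // ts: keys < k, tg: keys >= k with k at its root
    mid = tg;                    // the single-node tree for k ...
    tg = mid->rlink;             // ... once its right subtree (keys > k) is detached
    mid->rlink = NULL;
}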

Alloy : pre-order traversal of binary tree in required ordering

I want a "pre-order" traversal of the nodes in BinaryTree visits them in
this order [N0,N1,N2,N3]
What should I do with the following structure?
one sig Ordering { // model a linear order on nodes
    first: Node,        // the first node in the linear order
    order: Node -> Node // for each node n, n.(Ordering.order) represents the
                        // node (if any) immediately after n in order
}
fact LinearOrder { // the first node in the linear order is N0; and
                   // the four nodes are ordered as [N0, N1, N2, N3]
}
pred SymmetryBreaking(t: BinaryTree) { // if t has a root node, it is the
    // first node according to the linear order; and
    // a "pre-order" traversal of the nodes in t visits them according
    // to the linear order
}
Your question has two parts: first, define the ordering for N0, N1, ...; second, define the symmetry break using the linear ordering.
First, you can define the ordering by listing all the valid relationships, like
`N0.(Ordering.order) = N1`
Second, define the predicate stating that a pre-order traversal of the tree follows the linear ordering. Basically, there are two cases, and the first one is trivial: no t.root. However, when the tree does have a root, it must have the following three properties:
For every node n: if h = n.left, then h.val = n.val.(Ordering.order)
For every node n: if h = n.right, then h.val = (one node in n's left sub-tree, or n itself).(Ordering.order)
t.root = N0
If you translate the whole description into Alloy, it will be something like
no t.root or
{
    t.root = N0 // 3
    all h : t.root.^(left+right) | one n : t.root.*(left+right) |
        (n.left = h and n.(Ordering.order) = h) or                                    // 1
        (one l : (n.left.*(left+right) + n) | n.right = h and l.(Ordering.order) = h) // 2
}
Note: there may be many alternative solutions, and this one is definitely not the simplest.

Finding the left-most child for every node in a tree in linear time?

A paper I am reading claims that
It is easy to see that there is a linear time algorithm to compute the function l()
where l() gives the left-most child (both input and output are in postorder traversal of the tree). However, I can only think of a naive O(n^2) implementation where n is the number of nodes in the tree.
As an example, consider the following tree:
  a
 / \
c   b
In postorder traversal, the tree is c b a. The corresponding function l() should give c b c.
Here is my implementation in O(n^2) time.
public Object[] computeFunctionL(){
    ArrayList<String> l = new ArrayList<String>();
    l = l(this, l);
    return l.toArray();
}

private ArrayList<String> l(Node currentRoot, ArrayList<String> l){
    for (int i = 0; i < currentRoot.children.size(); i++){
        l = l(currentRoot.children.get(i), l);
    }
    while (currentRoot.children.size() != 0){
        currentRoot = currentRoot.children.get(0);
    }
    l.add(currentRoot.label);
    return l;
}
The tree is made as:
public class Node {
    private String label;
    private ArrayList<Node> children = new ArrayList<Node>();
    ...
There is a simple recursive algorithm that computes this information in O(1) time per node. Since there are n total nodes, it runs in O(n) total time.
The basic idea is the following recursive insight:
For any node n with no left child, l(n) = n.
Otherwise, if n has left child L, then l(n) = l(L).
This gives rise to this recursive algorithm, which annotates each node with its l value:
function computeL(node n) {
    if n is null, return.
    computeL(n.left)
    computeL(n.right)
    if n has no left child:
        set n.l = n
    else:
        set n.l = n.left.l
}
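In C++, that recursion might look roughly like this (a sketch assuming a simple binary node with a string label; it also collects the l values in post-order, matching the output format asked for in the question):

#include <string>
#include <vector>

struct TreeNode {
    std::string label;
    TreeNode *left = nullptr;
    TreeNode *right = nullptr;
};

// Returns l(n) and appends the l value of every node, in post-order, to `out`.
// Each node does O(1) work on top of its recursive calls, so the walk is O(n).
std::string computeL(const TreeNode* n, std::vector<std::string>& out) {
    if (!n) return "";
    std::string leftL;
    if (n->left)  leftL = computeL(n->left, out);
    if (n->right) computeL(n->right, out);
    std::string l = n->left ? leftL : n->label;   // no left child: l(n) = n
    out.push_back(l);
    return l;
}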
Hope this helps!
You can find l() for the entire tree in less than O(n^2) time. The idea is to traverse the tree, maintaining a stack of the nodes you've visited while going down the left branch. When you get to a node with no left child, that node is the leftmost node for everything on the stack.
Here's an example:
class BTreeNode
{
    public readonly int Value;
    public BTreeNode LeftChild { get; private set; }
    public BTreeNode RightChild { get; private set; }
}
void ShowLeftmost(BTreeNode node, Stack<int> stack)
{
    if (node.LeftChild == null)
    {
        // this is the leftmost node of every node on the stack
        while (stack.Count > 0)
        {
            var v = stack.Pop();
            Console.WriteLine("Leftmost node of {0} is {1}", v, node.Value);
        }
    }
    else
    {
        // push this value onto the stack so that
        // we can add its leftmost node when we find it.
        stack.Push(node.Value);
        ShowLeftmost(node.LeftChild, stack);
    }
    if (node.RightChild != null)
        ShowLeftmost(node.RightChild, stack);
}
The complexity is clearly not O(n^2). Rather, it's O(n).
It takes O(n) to traverse the tree. No node is placed on the stack more than once. The worst case for this algorithm is a tree that contains all left nodes. In that case it's O(n) to traverse the tree and O(n) to enumerate the stack. The best case is a tree that contains all right nodes, in which case there is never any stack to enumerate.
So O(n) time complexity, with O(n) worst case extra space for the stack.
Take a look at section 3.1:
3.1. Notation. Let T[i] be the ith node in the tree according to the left-to-right
postorder numbering, l(i) is the number of the leftmost leaf descendant of the subtree
rooted at T[i].
Given that sentence about notation, I would assume that the function l() is referring to finding a single node in linear time.
There may be a more elegant (better than O(n^2)) way of finding l() for an entire tree but I think it's referring to a single node.

How to find the first common ancestor of two nodes in a binary tree?

Following is my algorithm to find the first common ancestor. But I don't know how to calculate its time complexity; can anyone help?
public Tree commonAncestor(Tree root, Tree p, Tree q) {
    if (covers(root.left, p) && covers(root.left, q))
        return commonAncestor(root.left, p, q);
    if (covers(root.right, p) && covers(root.right, q))
        return commonAncestor(root.right, p, q);
    return root;
}

private boolean covers(Tree root, Tree p) { /* is p a child of root? */
    if (root == null) return false;
    if (root == p) return true;
    return covers(root.left, p) || covers(root.right, p);
}
Ok, so let's start by identifying what the worst case for this algorithm would be. covers searches the tree from left to right, so you get the worst-case behavior if the node you are searching for is the rightmost leaf, or it is not in the subtree at all. At this point you will have visited all the nodes in the subtree, so covers is O(n), where n is the number of nodes in the tree.
Similarly, commonAncestor exhibits worst-case behavior when the first common ancestor of p and q is deep down to the right in the tree. In this case, it will first call covers twice, getting the worst time behavior in both cases. It will then call itself again on the right subtree, which in the case of a balanced tree is of size n/2.
Assuming the tree is balanced, we can describe the run time by the recurrence relation T(n) = T(n/2) + O(n). Using the master theorem, we get the answer T(n) = O(n) for a balanced tree.
Now, if the tree is not balanced, we might in the worst case only reduce the size of the subtree by 1 for each recursive call, yielding the recurrence T(n) = T(n-1) + O(n). The solution to this recurrence is T(n) = O(n^2).
You can do better than this, though.
For example, instead of simply determining which subtree contains p or q with covers, let's determine the entire path to p and q. This takes O(n) just like covers; we're just keeping more information. Now, traverse those paths in parallel and stop where they diverge. This is always O(n).
If you have pointers from each node to their parent you can even improve on this by generating the paths "bottom-up", giving you O(log n) for a balanced tree.
Note that this is a space-time tradeoff, as while your code takes O(1) space, this algorithm takes O(log n) space for a balanced tree, and O(n) space in general.
As hammar’s answer demonstrates, your algorithm is quite inefficient as many operations are repeated.
I would do a different approach: instead of testing for every potential root node whether the two given nodes are not in the same sub-tree (thus making it the first common ancestor), I would determine the paths from the root to the two given nodes and compare them. The last common node on the paths from the root downwards is then also the first common ancestor.
Here’s an (untested) implementation in Java:
private List<Tree> pathToNode(Tree root, Tree node) {
    if (root == null) return null;
    List<Tree> path = new LinkedList<Tree>(), tmp;
    // root is the wanted node
    if (root == node) {
        path.add(root);
        return path;
    }
    // check if left child of root is the wanted node
    if (root.left == node) {
        path.add(root);
        path.add(root.left);
        return path;
    }
    // check if right child of root is the wanted node
    if (root.right == node) {
        path.add(root);
        path.add(root.right);
        return path;
    }
    // find path to node in the left sub-tree
    tmp = pathToNode(root.left, node);
    if (tmp != null && tmp.size() > 1) {
        // path to node found; prepend the current root to it
        path = tmp;
        path.add(0, root);
        return path;
    }
    // find path to node in the right sub-tree
    tmp = pathToNode(root.right, node);
    if (tmp != null && tmp.size() > 1) {
        // path to node found; prepend the current root to it
        path = tmp;
        path.add(0, root);
        return path;
    }
    return null;
}
public Tree commonAncestor(Tree root, Tree p, Tree q) {
    List<Tree> pathToP = pathToNode(root, p),
               pathToQ = pathToNode(root, q);
    // check whether both paths exist
    if (pathToP == null || pathToQ == null) return null;
    // walk both paths in parallel until the nodes differ,
    // remembering the last node the paths had in common
    Iterator<Tree> iterP = pathToP.iterator(), iterQ = pathToQ.iterator();
    Tree ancestor = null;
    while (iterP.hasNext() && iterQ.hasNext()) {
        Tree nextP = iterP.next(), nextQ = iterQ.next();
        if (nextP != nextQ) break;
        ancestor = nextP;
    }
    // the last matching node is the first common ancestor
    return ancestor;
}
Both pathToNode and commonAncestor are in O(n).

Generating uniformly random curious binary trees

A binary tree of N nodes is 'curious' if it is a binary tree whose node values are 1, 2, ..., N and which satisfies the properties that
Each internal node of the tree has exactly one descendant which is greater than it.
Every number in 1,2, ..., N appears in the tree exactly once.
Example of a curious binary tree
  4
 / \
5   2
   / \
  1   3
Can you give an algorithm to generate a uniformly random curious binary tree of n nodes, which runs in O(n) guaranteed time?
Assume you only have access to a random number generator which can give you a (uniformly distributed) random number in the range [1, k] for any 1 <= k <= n. Assume the generator runs in O(1).
I would like to see an O(n log n) time solution too.
Please follow the usual definition of labelled binary trees being distinct, to consider distinct curious binary trees.
There is a bijection between "curious" binary trees and standard heaps. Namely, given a heap, recursively (starting from the top) swap each internal node with its largest child. And, as I learned in StackOverflow not long ago, a heap is equivalent to a permutation of 1,2,...,N. So you should make a random permutation and turn it into a heap; or recursively make the heap in the same way that you would have made a random permutation. After that you can convert the heap to a "curious tree".
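For concreteness, here is a small sketch of that transformation (my own illustration, not from the answer's author), using a max-heap stored in an array so that the shape is a complete binary tree; each internal position takes the value of its larger child before recursing into both children:

#include <cstdio>
#include <utility>
#include <vector>

// Starting from the root of a max-heap (children of i at 2i+1 and 2i+2),
// swap each internal node with its larger child, then recurse into both
// children. The array keeps the same tree shape, now with a "curious" labelling.
void heapToCurious(std::vector<int>& a, std::size_t i = 0) {
    std::size_t l = 2 * i + 1, r = 2 * i + 2;
    if (l >= a.size()) return;                           // leaf: nothing to do
    std::size_t big = (r < a.size() && a[r] > a[l]) ? r : l;
    std::swap(a[i], a[big]);                             // take the larger child's value
    heapToCurious(a, l);
    heapToCurious(a, r);
}

int main() {
    std::vector<int> heap = {5, 4, 3, 1, 2};             // a small max-heap on 1..5
    heapToCurious(heap);
    for (int x : heap) std::printf("%d ", x);            // prints: 4 2 3 1 5
    std::printf("\n");
}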
Aha, I think I've got how to create a random heap in O(N) time. (After which, use the approach in Greg Kuperberg's answer to transform it into a "curious" binary tree.)
edit 2: Rough pseudocode for making a random min-heap directly. Max-heap is identical except the values inserted into the heap are in reverse numerical order.
struct Node {
    Node left, right;
    Object key;
    constructor newNode() {
        N = new Node;
        N.left = N.right = null;
        N.key = null;
    }
}
function create-random-heap(RandomNumberGenerator rng, int N)
{
    Node heap = Node.newNode();
    // Creates a heap with an "incomplete" node containing a null, and having
    // both child nodes as null.
    List incompleteHeapNodes = [heap];
    // use a vector/array type list to keep track of incomplete heap nodes.
    for k = 1:N
    {
        // loop invariant: incompleteHeapNodes has k members. Order is unimportant.
        int m = rng.getRandomNumber(k);
        // create a random number between 0 and k-1
        Node node = incompleteHeapNodes.get(m);
        // pick a random node from the incomplete list,
        // make it a complete node with key k.
        // It is ok to do so since all of its parent nodes
        // have values less than k.
        node.left = Node.newNode();
        node.right = Node.newNode();
        node.key = k;
        // Now remove this node from incompleteHeapNodes
        // and add its children. (replace node with node.left,
        // append node.right)
        incompleteHeapNodes.set(m, node.left);
        incompleteHeapNodes.append(node.right);
        // All operations in this loop take O(1) time.
    }
    return prune-null-nodes(heap);
}
// get rid of all the incomplete nodes.
function prune-null-nodes(heap)
{
    if (heap == null || heap.key == null)
        return null;
    heap.left = prune-null-nodes(heap.left);
    heap.right = prune-null-nodes(heap.right);
    return heap;
}
