Binary search tree intersection - algorithm

I have 2 binary search trees T1 and T2 with the same number of nodes n >= 1. For each node P we have LEFT(P) and RIGHT(P) for links between nodes and KEY(P) for the value of the node. The root of T1 is R1 and the root of T2 is R2.
I need a linear algorithm which determines the values that appear in both T1 and T2.
My idea until now is to do an inorder traversal of T1 and search in T2 for current element, like this:
inorder(node)
    if node is not NULL
        inorder(LEFT(node))
        if find(KEY(node), R2)
            print KEY(node)
        inorder(RIGHT(node))
Here find(KEY(node), R2) implements a binary search for KEY(node) in tree T2.
Is this the correct solution? Is it a linear algorithm? (I know the traversal itself is O(n).) Or is there another method to intersect 2 binary search trees?
Thanks!

Your current inorder traversal uses recursion to perform the task, which makes it difficult to run more than one traversal at the same time.
So, first I would rewrite the method to use an explicit stack (example here in C#). Now, duplicate all of the state so that we perform traversals of both trees at the same time.
At any point where we're ready to yield a value from both trees, we compare their KEY() values. If they are unequal then we carry on the traversal of the tree with the lower KEY() value.
If both values are equal then we yield that value and continue traversing both trees again.
This is similar in concept to merging two sorted sequences - all we need to do is to examine the "next" value to be yielded by each sequence, yield the lower of the two values and then move forward in that sequence.
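A minimal Python sketch of this merge-style traversal, using explicit stacks so both trees advance in lockstep; the Node class and helper names are mine, not from the question:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def push_left(node, stack):
    # Descend along left links, saving nodes to visit later.
    while node:
        stack.append(node)
        node = node.left

def intersection(r1, r2):
    """Yield the common keys of two BSTs in sorted order, O(n) total."""
    s1, s2, out = [], [], []
    push_left(r1, s1)
    push_left(r2, s2)
    while s1 and s2:
        n1, n2 = s1[-1], s2[-1]
        if n1.key < n2.key:
            s1.pop(); push_left(n1.right, s1)   # advance tree 1 only
        elif n2.key < n1.key:
            s2.pop(); push_left(n2.right, s2)   # advance tree 2 only
        else:
            out.append(n1.key)                   # common value: yield it
            s1.pop(); push_left(n1.right, s1)    # and advance both trees
            s2.pop(); push_left(n2.right, s2)
    return out
```

Each node of each tree is pushed and popped exactly once, which is where the O(n) bound comes from.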
In answer to your original proposal:
Is this a linear algorithm?
No. For every node you visit during your inorder traversal, you're calling find, which is O(log n) on a balanced tree (and up to O(n) if T2 is degenerate). So your complete algorithm is O(n log n), not O(n).

Related

How can I split an AVL tree at a given node in time O(log(n))?

I've been busting my head trying all kinds of ways but the best I've got is O(log^2(n)).
The exact question is:
write a function Split(AVLtree T, int k) which returns 2 AVL trees (as a tuple) such that all values in T1 are lower than or equal to k and the rest are in T2. k is not necessarily in the tree. The time must be O(log(n)).
Assume an efficient implementation of an AVL tree; I have managed to write a merge function that runs in O(log(|h1-h2|)) time.
Any help would be greatly appreciated.
You're almost there, given that you have the merge function!
Do a regular successor search in the tree for k. This will trace out a path through the tree from the root to that successor node. Imagine cutting every edge traced out on the path this way, which will give you a collection of "pennants," single nodes with legal AVL trees hanging off to the sides. Then, show that if you merge them back together in the right order, the costs of the merges form a telescoping sum that adds up to O(log n).
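To illustrate where the edges get cut, here is a sketch of the path-cutting idea on a plain (unbalanced) BST in Python; the Node class is my own. A real AVL split would instead collect the resulting "pennants" along the search path and glue them back with your O(log(|h1-h2|)) merge, which is what makes the costs telescope:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def split(node, k):
    """Return (T1, T2): keys <= k go to T1, keys > k go to T2.

    Walks only the search path for k, cutting one edge per step.
    This version does NOT rebalance; it only shows the cut pattern.
    """
    if node is None:
        return None, None
    if node.key <= k:
        # node and its whole left subtree belong to T1;
        # only its right subtree can still contain keys > k.
        t1, t2 = split(node.right, k)
        node.right = t1
        return node, t2
    else:
        # node and its whole right subtree belong to T2.
        t1, t2 = split(node.left, k)
        node.left = t2
        return t1, node
```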

Does every level order traversal uniquely define a BST?

Suppose I have to compare whether two binary search trees are similar. Now, the basic approach is the recursive formulation that checks for the root to be equal and then continues to check the equality of the corresponding right and left subtrees.
However, will it be correct to state that if the binary search trees have the same level order traversals then they are the same? Stated differently, does every BST have a unique level order traversal?
No, it isn't.
The first one:
1
 \
  2
   \
    3
The second:
  1
 / \
2   3
Level order will give 1 - 2 - 3 for these two.
Since the information-theoretic lower bound on representing a binary tree with n nodes is 2n - Θ(log n) bits, I don't think any simple traversal can identify a binary tree.
A Google search for "lower bound bits binary tree" confirms the lower bound.
There is a simple reduction from BSTs to binary trees. Consider the BSTs with node values 1..n. The number of these BSTs equals the number of binary trees with n nodes (you could always do a preorder traversal and insert the values in that order). If you could use a level order traversal to identify such a BST, you could encode it with 1 for an "in-level" node and 0 for an "end-level" node. The first tree above becomes "000", the second one "010". That would let a BST be identified with just n bits, which does not fit the information-theoretic lower bound.
Well, I discussed this question with a friend of mine, so the answer isn't exactly mine! Here's what we came up with: the level order traversal of a BST can be sorted, which gives you the inorder traversal of that BST. Now you have two traversals, which together uniquely identify the BST. So it wouldn't be incorrect to state that every BST has a unique level order traversal.
Algorithm:
ConstructBST(levelorder[], int size)
    1. Declare array A of size n.
    2. Copy levelorder into A.
    3. Sort A.
    4. From the two traversals A (now the inorder) and levelorder of the Binary Search Tree, construct the tree.
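A quicker way to sanity-check the uniqueness claim: inserting the level order sequence into an empty BST rebuilds the original tree, because every node's ancestors precede it in level order, so every key lands back in its original position. A small Python sketch (class and helper names are mine):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    # Plain BST insertion, no balancing.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def from_level_order(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

def level_order(root):
    # Breadth-first walk; a list as queue keeps the sketch short.
    out, queue = [], [root] if root else []
    while queue:
        node = queue.pop(0)
        out.append(node.key)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return out
```

Round-tripping a level order sequence through from_level_order and level_order returns the same sequence, which is the uniqueness property in action.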

Sorting an n element array with O(logn) distinct elements in O(nloglogn) worst case time

The problem at hand is what's in the title itself: give an algorithm which sorts an n element array with O(log n) distinct elements in O(n log log n) worst case time. Any ideas?
Further, how do you generally handle arrays with multiple non-distinct elements?
O(log(log(n))) time is enough for you to do a primitive operation in a search tree with O(log(n)) elements.
Thus, maintain a balanced search tree of all the distinct elements you have seen so far. Each node in the tree additionally contains a list of all elements you have seen with that key.
Walk through the input elements one by one. For each element, try to insert it into the tree (which takes O(log log n) time). If you find you've already seen an equal element, just insert it into the auxiliary list in the already-existing node.
After traversing the entire list, walk through the tree in order, concatenating the auxiliary lists. (If you take care to insert at the right ends of the auxiliary lists, this is even a stable sort.)
A simple O(log n) space solution would be:
find the distinct elements using a balanced tree (O(log n) space, O(n) time)
then use this tree to always pick the correct pivot for quicksort.
I wonder if there is an O(log log n) space solution.
Some details about using a tree:
You should be able to use a red black tree (or other type of tree based sorting algorithm) using nodes that hold both a value and a counter: maybe a tuple (n, count).
When you insert a new value you either create a new node or you increment the count of the node that already holds that value. If you just increment the counter it will take you O(H), where H is the height of the tree (to find the node); if you need to create the node it will also take O(H) to create and position it (the constants are bigger, but it's still O(H)).
This ensures that the tree holds no more than O(log n) distinct values, so its height H is O(log log n). Each insertion therefore takes O(log log n), and with n insertions the total is O(n log log n).

How to create an AVL Tree from ArrayList of values in O(n) time?

My assignment is to create an AVL tree from a sorted array list of values in O(n) time, where n is the number of values.
I have been working on this but I cannot get O(n) time; the best I can get is O(n log(n)).
My problem is that every time an inserted node leaves the tree unbalanced, I have to do another loop to find the unbalanced node and apply rotation(s) to rebalance the tree.
Any help is greatly appreciated, thanks!
How about just creating a complete balanced tree, with a few nodes at the lowest level possibly missing, e.g., for 6 elements, create
      o
    /   \
   o     o
  / \   /
 o   o o
Then do an inorder walk, and when you visit the i-th node, set its key to A[i].
This is a valid AVL tree, since at every node the heights of the left and right subtrees differ by at most one.
The original tree can be constructed in O(n), and the inorder walk in O(n), so the complexity is O(n).
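The two-phase construction above might look like this in Python (all names are mine). Here build_shape splits the node count as evenly as possible, which yields a height-balanced, AVL-valid shape, though not literally the "lowest level filled from the left" shape drawn above; the fill step is the same inorder walk either way:

```python
class Node:
    def __init__(self):
        self.key, self.left, self.right = None, None, None

def build_shape(n):
    # Phase 1: build an n-node shape with no keys. Subtree sizes
    # differ by at most one, so subtree heights do too (AVL-valid).
    if n == 0:
        return None
    root = Node()
    root.left = build_shape(n // 2)
    root.right = build_shape(n - 1 - n // 2)
    return root

def fill_inorder(root, values):
    # Phase 2: the i-th node in inorder gets values[i].
    it = iter(values)
    def walk(node):
        if node:
            walk(node.left)
            node.key = next(it)
            walk(node.right)
    walk(root)
    return root

def avl_from_sorted(values):
    # Both phases visit each node once, so the total is O(n).
    return fill_inorder(build_shape(len(values)), values)
```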
Incidentally, on a semi-related note, there's a technique called heapify for building a heap (min or max) out of an array of integers that's O(n) for a length-n array, even though insertion into a heap is O(log n); the trick is to do it bottom-up.
Inserting is O(log n), so it's true that nobody can do better than O(n log n) by inserting. But you really shouldn't insert into the AVL tree; you should just create it. Create all the nodes and pick the values from the array as you construct them. Don't find/search for the value you need, just take it: the array is sorted.
Given a list that contains 5 elements and is sorted, for example [1,2,3,4,5], what would be the root of the tree? How about for 7 elements? 10? ...?
After you got the root, then what would be the root of the left subtree. What's the list to look at? Which part of the list do you have to store in the left subtree?
That's all.
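Spelling out that hint as a Python sketch (the names are mine): the middle element of the sorted range becomes the root and each half builds a subtree. Indices alone drive the recursion, with no searching, so each node is created exactly once and the whole construction is O(n):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build(a, lo=0, hi=None):
    """Build a height-balanced BST from the sorted slice a[lo:hi]."""
    if hi is None:
        hi = len(a)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2          # middle element is the root
    return Node(a[mid],
                build(a, lo, mid),        # left half -> left subtree
                build(a, mid + 1, hi))    # right half -> right subtree
```

For [1,2,3,4,5] the root comes out as 3, which answers the "what would be the root?" question above.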

Median of BST in O(logn) time complexity

I came across the solution given at http://discuss.joelonsoftware.com/default.asp?interview.11.780597.8 using Morris inorder traversal, with which we can find the median in O(n) time.
But is it possible to achieve the same using O(logn) time? The same has been asked here - http://www.careercup.com/question?id=192816
If you also maintain the count of the number of left and right descendants of a node, you can do it in O(logN) time, by doing a search for the median position. In fact, you can find the kth largest element in O(logn) time.
Of course, this assumes that the tree is balanced. Maintaining the count does not change the insert/delete complexity.
If the tree is not balanced, then you have Omega(n) worst case complexity.
See: Order Statistic Tree.
btw, Big-O and small-o are very different (your title says small-o).
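A sketch of that order-statistic search in Python; the Node class and field names are my own, and the sketch assumes each node's subtree size is kept up to date by insert/delete (which it omits):

```python
def size(node):
    return node.size if node else 0

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)   # subtree node count

def kth(node, k):
    """Return the k-th smallest key (0-based). O(height) per query."""
    while node:
        left = size(node.left)
        if k < left:
            node = node.left          # answer is in the left subtree
        elif k == left:
            return node.key           # exactly k smaller keys exist
        else:
            k -= left + 1             # skip left subtree and this node
            node = node.right
    raise IndexError("k out of range")
```

The median of a tree with n nodes is then just kth(root, n // 2), and on a balanced tree that is O(log n).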
Unless you guarantee some sort of balanced tree, it's not possible.
Consider a tree that's completely degenerate -- e.g., every left pointer is NULL (nil, whatever), so each node only has a right child (i.e., for all practical purposes the "tree" is really a singly linked list).
In this case, just accessing the median node (at all) takes linear time -- even if you started out knowing that node N was the median, it would still take N steps to get to that node.
We can find the median by using a rabbit pointer and a turtle pointer. The rabbit moves twice as fast as the turtle in the inorder traversal of the BST, so when the rabbit reaches the end of the traversal, the turtle is at the median of the BST.
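For what it's worth, the rabbit-and-turtle idea can be sketched in Python with two lazy inorder generators (the Node class is my own). Note that every key is still produced once, so this is O(n), not O(log n); and as written it returns the upper median when n is even:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(node):
    # Lazy inorder traversal: keys come out in sorted order.
    if node:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)

def median(root):
    turtle, rabbit = inorder(root), inorder(root)
    m = next(turtle)                      # raises StopIteration on an empty tree
    while True:
        # The rabbit consumes two keys per turtle step.
        if next(rabbit, None) is None:
            return m
        if next(rabbit, None) is None:
            return m
        m = next(turtle)
```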
