Find whether the following tree exists in a list of millions of binary search trees - algorithm

For example,
consider the following trees and check whether they exist in the list of BSTs.
5
/ \
4 6
/ \
1 3
3
/ \
2 4
How should I approach this problem?

Sort the list by root value (if the roots are the same, compare the left child next, and so on). For each query tree, do a binary search.
This works well if the number of queries is comparable to the number of elements in the list. Complexity: O((n + m) log n) tree comparisons, where m is the number of queries and n is the number of trees in the list.
If the number of queries is small, a brute-force search is efficient enough.

I'll put it up as an answer so people can make variations if they'd like.
A naive approach is to scan through the list and compare each tree with the query tree node by node; as soon as you see a difference between the two trees you're comparing, move on to the next tree in the list. This is O(N), where N is the total number of nodes in the list.
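The scan described above could look roughly like this in Python. This is only a sketch: the `Node` class and the names `same_tree` and `exists_in` are made up for illustration, and the recursion assumes the trees are not deeper than Python's recursion limit.

```python
class Node:
    """Hypothetical minimal BST node used only for this illustration."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def same_tree(a, b):
    """Compare two trees node by node, stopping at the first difference."""
    if a is None or b is None:
        return a is b  # equal only if both subtrees are empty
    return (a.key == b.key
            and same_tree(a.left, b.left)
            and same_tree(a.right, b.right))


def exists_in(trees, query):
    """Linear scan over the list; any() stops at the first matching tree."""
    return any(same_tree(tree, query) for tree in trees)


trees = [
    Node(3, Node(2), Node(4)),
    Node(2, Node(1), Node(3)),
]
found = exists_in(trees, Node(3, Node(2), Node(4)))
missing = exists_in(trees, Node(3, Node(1), Node(4)))
```

Because `and` short-circuits, a mismatch near the root is detected after only a few comparisons.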

The answer to this question was to put all the trees of the list in a hash table, so that looking up a tree takes constant time on average once the query tree has been hashed.
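One possible way to make a tree usable as a hash key is to turn it into a nested tuple that records both keys and shape. The sketch below assumes a hypothetical `Node` class and a helper called `signature`; neither comes from the original answer, and building the signature still costs time proportional to the size of the tree.

```python
class Node:
    """Hypothetical minimal BST node used only for this illustration."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def signature(node):
    """Nested tuple (key, left, right); None marks an empty subtree.

    Two trees get the same signature exactly when they have the same
    keys in the same positions, and tuples are hashable.
    """
    if node is None:
        return None
    return (node.key, signature(node.left), signature(node.right))


# Preprocessing: hash every tree in the list once.
trees = [
    Node(3, Node(2), Node(4)),
    Node(5, Node(4, Node(1)), Node(6)),
]
known = {signature(tree) for tree in trees}

# Query: one signature computation plus one average O(1) set lookup.
hit = signature(Node(3, Node(2), Node(4))) in known
miss = signature(Node(3, None, Node(4))) in known
```

The `None` markers matter: without them, trees with the same keys but different shapes could collapse to the same key.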

Related

Checking for equality of binary search trees constructed from unsorted arrays

This question has already been asked here: http://stackoverflow.com/questions/14092710/equality-of-two-binary-search-trees-constructed-from-unordered-arrays.
However, the solution that I thought of is fairly simple but hasn't been mentioned there.
The problem statement is: Given two arrays, each representing a sequence of keys, imagine we build a Binary Search Tree (BST) from each array by inserting its keys in order. We need to tell whether the two BSTs will be identical or not, without actually constructing the trees.
Something that I thought of is, just sort the two arrays. If they are identical, then their inorder traversal will also be identical and hence if the two arrays are identical, then they definitely will lead to the same BST. Am I wrong in assuming that if the inorder traversal of two binary search trees is same, then trees will be same too?
Unless I'm misunderstanding what you mean by "inorder traversal", this won't work. Not only are inorder traversals of BSTs not unique, in fact the inorder traversal of every BST on the same set of elements is the same. Here's a small example:
    4
   / \
  2   5
 / \
1   3

  2
 / \
1   4
   / \
  3   5
Both trees have the inorder traversal 1, 2, 3, 4, 5. So your approach will report "IDENTICAL" even though the trees are different.
Your approach is actually wrong in the other direction too. If the two BSTs have the same structure but different elements, then you should report "IDENTICAL", but of course their sorted lists ( = inorder traversals) are different -- so in this case you will report "DIFFERENT".
An inorder traversal alone cannot define a unique BST. You also need a preorder or postorder traversal to reconstruct that exact BST.
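For comparison, here is a sketch of one commonly used way to decide the question without building either tree. It is not the sorting idea from the question: it compares the roots, splits the remaining keys into "smaller" and "larger" groups while keeping their insertion order, and recurses. The function name `same_bst` is made up for illustration, and the keys are assumed to be distinct.

```python
def same_bst(a, b):
    """Return True if inserting a and b in order yields identical BSTs.

    Assumes distinct keys. The first element of each sequence becomes the
    root; the rest are split by comparison with the root, preserving the
    insertion order inside each group.
    """
    if len(a) != len(b):
        return False
    if not a:
        return True
    if a[0] != b[0]:
        return False  # different roots
    root = a[0]
    a_small = [x for x in a[1:] if x < root]
    b_small = [x for x in b[1:] if x < root]
    a_large = [x for x in a[1:] if x > root]
    b_large = [x for x in b[1:] if x > root]
    return same_bst(a_small, b_small) and same_bst(a_large, b_large)


# Both sequences build the tree rooted at 4 with left child 2 and right child 5.
same = same_bst([4, 2, 5, 1, 3], [4, 5, 2, 3, 1])
# Same sorted order (1..5), but the roots differ, so the trees differ.
different = same_bst([4, 2, 5, 1, 3], [2, 1, 4, 3, 5])
```

The second call shows why comparing sorted arrays is not enough: both inputs sort to 1, 2, 3, 4, 5, yet they produce different trees.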

What is the time complexity of the best case to insert a new node into a minimum-level BST with n nodes?

I am learning about algorithm complexity and how to calculate time complexity. The question is:
What is the time complexity of the best case to insert a new node into a minimum-level BST with n nodes? Explain. (Hint: You may draw a diagram as part of your solution.)
Can you please explain in detail how you would solve this question and similar questions?
My attempt:
For time complexity we have two questions: how many times, and what does it cost?
How many times:
There will be one element to check, so O(1).
How much does it cost?
Now I am stuck here (pretty early). I am assuming that since it's a tree, there will be n/2 elements left after the first comparison, and it keeps splitting in half.
Consider the following minimum-height BST (any binary tree with 8 nodes has at least 4 levels, thus it has the minimum height).
        8
       /
      4
    /   \
   2     6
  / \   / \
 1   3 5   7
Now let's say you insert the value 9: it goes straight to the right of the root, after a single comparison.
To generalize this example: a BST whose root has one empty side and a perfect (completely filled) tree on the other side is a minimum-height BST. Any value you insert that falls on the empty side, i.e. is greater (or less) than the root, is attached directly as the root's right (or left) child. In this case the insert takes O(1) time.
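To make the best case concrete, the following sketch builds the 8-node tree from the example and counts key comparisons during an insert. The `Node` class and the `insert` function returning a comparison count are illustrative assumptions, not part of the original answer.

```python
class Node:
    """Hypothetical minimal BST node used only for this illustration."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST and return (root, number of comparisons)."""
    if root is None:
        return Node(key), 0
    node = root
    comparisons = 0
    while True:
        comparisons += 1
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root, comparisons
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root, comparisons
            node = node.right


# Build the example tree: 8 at the root, everything else in its left subtree.
root = None
for k in [8, 4, 2, 6, 1, 3, 5, 7]:
    root, _ = insert(root, k)

root, best_steps = insert(root, 9)   # lands directly to the right of the root
root, deep_steps = insert(root, 0)   # has to walk down 8 -> 4 -> 2 -> 1
```

Here `best_steps` is 1 regardless of how many nodes sit in the left subtree, while `deep_steps` grows with the height of the tree.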

Number of binary search trees over n distinct elements

How many binary search trees can be constructed from n distinct elements? And how can we find a mathematically proved formula for it?
Example:
If we have 3 distinct elements, say 1, 2, 3, there
are 5 binary search trees.
Given n elements, the number of binary search trees that can be made from those elements is given by the nth Catalan number (denoted Cn). This is equal to Cn = (2n)! / ((n + 1)! n!), i.e. the binomial coefficient "2n choose n" divided by n + 1.
Intuitively, the Catalan numbers represent the number of ways that you can create a structure out of n elements that is made in the following way:
1. Order the elements as 1, 2, 3, ..., n.
2. Pick one of those elements to use as a pivot element. This splits the remaining elements into two groups: those that come before the element and those that come after.
3. Recursively build structures out of those two groups.
4. Combine those two structures together with the one element you removed to get the final structure.
This pattern perfectly matches the ways in which you can build a BST from a set of n elements. Pick one element to use as the root of the tree. All smaller elements must go to the left, and all larger elements must go to the right. From there, you can then build smaller BSTs out of the elements to the left and the right, then fuse them together with the root node to form an overall BST. The number of ways you can do this with n elements is given by Cn, and therefore the number of possible BSTs is given by the nth Catalan number.
Hope this helps!
I am sure this question is not just about counting with a mathematical formula. I took some time to write a program, along with an explanation of the idea behind the calculation.
I tried solving it with both recursion and dynamic programming. Hope this helps.
The formula is already given in the previous answer.
If you are interested in the solution and in understanding the approach, you can check my article, Count Binary Search Trees created from N unique elements.
Let T(n) be the number of BSTs on n elements.
Given n distinct ordered elements, numbered 1 to n, we select i as the root.
This leaves (1..i-1) in the left subtree, for T(i-1) combinations, and (i+1..n) in the right subtree, for T(n-i) combinations.
Therefore:
T(n) = sum_{i=1}^{n} T(i-1) * T(n-i)
and of course T(0) = T(1) = 1 (the empty tree counts as one tree).
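The recurrence can be evaluated bottom-up with a small table. The sketch below assumes T(0) = 1 for the empty tree, which the recurrence needs when the root is the smallest or largest element, and cross-checks the result against the closed-form Catalan number. The function names `count_bsts` and `catalan` are made up for illustration.

```python
from math import comb


def count_bsts(n):
    """Number of BSTs on n distinct keys, via T(n) = sum T(i-1) * T(n-i)."""
    t = [0] * (n + 1)
    t[0] = 1  # the empty tree
    for size in range(1, n + 1):
        # Choose the i-th smallest key as root: i-1 keys go left, size-i go right.
        t[size] = sum(t[i - 1] * t[size - i] for i in range(1, size + 1))
    return t[n]


def catalan(n):
    """Closed form: C(2n, n) / (n + 1), which is always an integer."""
    return comb(2 * n, n) // (n + 1)


three = count_bsts(3)
first_values = [count_bsts(k) for k in range(6)]
```

The table version takes O(n^2) arithmetic operations, while the closed form needs only one binomial coefficient.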

Finding closest number in a range

I thought of a problem, which is as follows:
We have an array A of n integers and t test cases. In every test case we are given a number m and a range [s, e], and we have to find the number closest to m within that range of the array (A[s..e]).
You may assume array indices run from 1 to n.
For example:
A = {5, 12, 9, 18, 19}
m = 13
s = 4 and e = 5
So the answer should be 18.
Constraints:
n<=10^5
t<=n
All I can think of is an O(n) solution for every test case, and I think a better solution exists.
This is a rough sketch:
Create a segment tree from the data. At each node, besides the usual data such as the left and right indices, also store the numbers found in the subtree rooted at that node, in sorted order. You can do this while constructing the segment tree bottom-up: a node just above the leaves stores its two leaf values in sorted order, and an intermediate node stores the numbers of its left and right children, combined with a standard merge. There are O(n) nodes in the tree, and each of the O(log(n)) levels stores n numbers in total, so keeping this data takes O(nlog(n)) overall.
Once you have this tree, for every query walk down the tree until you reach the node(s) that together make up the given range [s, e]. As the tutorial shows, a range is covered by combining O(log(n)) nodes, and since the tree depth is O(log(n)), reaching them takes O(log(n)) time. For each node that lies completely inside the range, find the number closest to m using binary search in the sorted array stored in that node, which is O(log(n)) per node. The closest among all of these is the answer. Thus, you can answer each query in O(log^2(n)) time.
The tutorial I link to covers other techniques, such as square-root decomposition, which are easier to implement and should need to look at about O(sqrt(n)) blocks per query. But I haven't thought much about this.
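A compact sketch of the segment tree idea, under a few assumptions: it uses an iterative, array-based layout rather than explicit left and right indices, the names `build`, `closest_in_sorted` and `query` are made up, and `sorted()` on the concatenated children stands in for a hand-written merge. Each query visits O(log n) nodes and does one binary search in each.

```python
from bisect import bisect_left


def build(values):
    """tree[i] holds the sorted values covered by node i (leaves at n..2n-1)."""
    n = len(values)
    tree = [[] for _ in range(2 * n)]
    for i, v in enumerate(values):
        tree[n + i] = [v]
    for i in range(n - 1, 0, -1):
        # Combine the two sorted child lists (sorted() used for brevity).
        tree[i] = sorted(tree[2 * i] + tree[2 * i + 1])
    return tree


def closest_in_sorted(sorted_vals, m):
    """Closest value to m in a non-empty sorted list, via binary search."""
    i = bisect_left(sorted_vals, m)
    candidates = sorted_vals[max(i - 1, 0):i + 1]  # neighbours around m
    return min(candidates, key=lambda v: abs(v - m))


def query(tree, s, e, m):
    """Closest value to m among A[s..e], with 1-based inclusive indices."""
    n = len(tree) // 2
    lo, hi = s - 1 + n, e + n  # half-open range over the leaves
    best = None
    while lo < hi:
        if lo & 1:  # lo is a right child: take it and step right
            cand = closest_in_sorted(tree[lo], m)
            if best is None or abs(cand - m) < abs(best - m):
                best = cand
            lo += 1
        if hi & 1:  # hi is exclusive: step left, then take that node
            hi -= 1
            cand = closest_in_sorted(tree[hi], m)
            if best is None or abs(cand - m) < abs(best - m):
                best = cand
        lo >>= 1
        hi >>= 1
    return best


A = [5, 12, 9, 18, 19]
tree = build(A)
answer = query(tree, 4, 5, 13)      # the example from the question
whole_array = query(tree, 1, 5, 13)
```

When two values are equally close to m, this sketch returns whichever it encounters first; the question does not say how ties should be broken.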
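Sort the array and do a binary search. Complexity: O(n log n + t log n). Note that sorting discards the original positions, so this only handles queries over the whole array, not an arbitrary range [s, e].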
I'm fairly sure no faster solution exists. A slight variation of your problem is:
There is no array A, but each test case contains an unsorted array of numbers to search. (The array slice of A from s to e).
In that case, there is clearly no better way than a linear search for each test case.
Now, in what way is your original problem more specific than the variation above? The only added information is that all the slices come from the same array. I don't think that this additional constraint can be used for an algorithmic speedup.
EDIT: I stand corrected. The segment tree data structure should work.

Measuring how "out-of-order" an array is

Given an array of values, I want to find the total "score", where the score of each element is the number of elements with a smaller value that occur before it in the array.
e.g.
values: 4 1 3 2 5
scores: 0 0 1 1 4
total score: 6
An O(n^2) algorithm is trivial, but I suspect it may be possible to do it in O(nlgn), by sorting the array. Does anyone have any ideas how to do that, or if it's not possible?
It looks like what you are doing is essentially counting pairs of elements by their relative order: your score counts the pairs that are already in ascending order, which is the complement of the number of inversions (pairs in the incorrect relative order). This can be done in O(n*log(n)) by using the same idea as merge sort. As you merge, each time you take an element from the right list, you add the number of elements from the left list that have already been merged, since those are the smaller elements that occur before it.
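A sketch of the merge sort idea applied directly to the score from the question (the number of smaller elements that occur earlier). The function name `total_score` is made up for illustration; equal values are not counted as smaller.

```python
def total_score(values):
    """Sum, over all elements, of the count of strictly smaller earlier elements."""
    def sort_count(a):
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, left_count = sort_count(a[:mid])
        right, right_count = sort_count(a[mid:])
        count = left_count + right_count
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] < right[j]:
                merged.append(left[i])
                i += 1
            else:
                # The i left elements merged so far are smaller than right[j]
                # and all of them occur before it in the original array.
                count += i
                merged.append(right[j])
                j += 1
        # Every leftover right element is preceded by the i merged left elements
        # (this adds 0 when the right half ran out first).
        count += i * (len(right) - j)
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, count

    return sort_count(list(values))[1]


score = total_score([4, 1, 3, 2, 5])
```

For an array of n distinct values, the number of inversions is n(n-1)/2 minus this score.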
If the range of your numbers is small enough, the fastest algorithm I can think of is one that uses Fenwick trees. Essentially, iterate through the list, query the Fenwick tree for how many smaller elements have already been inserted, then insert the current number into the tree. This will answer your question in O(nlogm), where n is the size of your list and m is your largest integer.
If you don't have a reasonable range on your integers (or you want to conserve space) MAK's solution is pretty damn elegant, so use that :)
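A minimal sketch of the Fenwick tree variant, assuming the values are integers between 1 and a known `max_value`. The function name `total_score_fenwick` and its parameters are made up for illustration.

```python
def total_score_fenwick(values, max_value):
    """Total score using a Fenwick (binary indexed) tree over value counts.

    Assumes every value is an integer in 1..max_value.
    """
    tree = [0] * (max_value + 1)  # 1-based Fenwick array

    def add(pos):
        """Record one occurrence of the value pos."""
        while pos <= max_value:
            tree[pos] += 1
            pos += pos & -pos

    def prefix_count(pos):
        """How many recorded values are <= pos."""
        total = 0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    score = 0
    for v in values:
        score += prefix_count(v - 1)  # strictly smaller values seen so far
        add(v)
    return score


fenwick_score = total_score_fenwick([4, 1, 3, 2, 5], 5)
```

If the values are large or not integers, they could first be replaced by their ranks in sorted order, which keeps the tree at size n at the cost of an extra sort.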

Resources