Binary Search Tree unique orderings? - data-structures

Given an arbitrary set of values V and building a tree by inserting values left to right, what does it mean if I'm asked if my orderings of these values (to construct a minimum height and maximum height tree) are unique?
I've read on the internet it must follow a Hamiltonian Path, but we never learned this. And I'm also not quite sure what a Hamiltonian Path is.
Is there a proof that an ordering I choose is a unique ordering?

I believe (though I'm not fully positive) that the question is asking you whether there are multiple different orders into which you could insert the values into the BST that would produce the same tree.
For example, consider this tree:
  1
 / \
0   2
There are two orders in which you could add the values into this tree to produce this result: 1, 0, 2 and 1, 2, 0.
On the other hand, this tree can only be formed in one way:
1
 \
  2
Namely, you have to insert 1 first, then 2.
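A minimal sketch for checking this by brute force on small inputs. It assumes plain (unbalanced) BST insertion and distinct values; the tuple representation and the helper names `insert`, `build`, and `orders_producing` are illustrative, not from the question.

```python
from itertools import permutations

def insert(tree, value):
    """Insert value into a BST stored as (value, left, right); None is the empty tree."""
    if tree is None:
        return (value, None, None)
    root, left, right = tree
    if value < root:
        return (root, insert(left, value), right)
    return (root, left, insert(right, value))

def build(order):
    """Build a BST by inserting the values left to right."""
    tree = None
    for value in order:
        tree = insert(tree, value)
    return tree

def orders_producing(order):
    """Return every insertion order that yields the same tree as `order`."""
    target = build(order)
    return [p for p in permutations(sorted(order)) if build(p) == target]

print(orders_producing([1, 0, 2]))  # [(1, 0, 2), (1, 2, 0)]
print(orders_producing([1, 2]))     # [(1, 2)]
```

An ordering is unique exactly when the returned list has one entry. Trying all permutations is only practical for a handful of values.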
Hope this helps!

Related

How to determine if two binary trees are equal or different

The picture below is the case of different binary trees that can be made with 3 nodes.
But why is the following case not included in the number of cases?
Is it the same case as the third case from the left in the picture above? If so, I think the parent-child relationship will be different.
I'd appreciate it if you could tell me how to determine how binary trees are equal to or different from each other.
You mean 3 nodes. Your case is not included because the cases all use A as the root node, so it is easier to demonstrate the different possible combinations using always the same elements in the same order, i.e. A as root, then B -> C on top and symmetrically C -> B at the bottom.
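Using B or C as the root, you could achieve the same number of variations. This number can be computed using the following recurrence, which sums over the number of nodes placed in the left subtree:
F(n) = F(0)*F(n-1) + F(1)*F(n-2) + ... + F(n-1)*F(0)
In this case, n=3, so F(0)*F(2) + F(1)*F(1) + F(2)*F(0) = 2 + 1 + 2 = 5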
F(0) = 1 (empty is considered one variation)
F(1) = 1
F(2) = 2
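The same count can be reproduced with a short sketch. It assumes F(n) is the number of binary tree shapes with n nodes, computed by splitting the remaining n-1 nodes between the left and right subtrees; the function name `num_trees` is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def num_trees(n):
    """Number of binary tree shapes with n nodes (F(n) in the text)."""
    if n == 0:
        return 1  # the empty tree counts as one variation
    # i nodes go to the left subtree, n - 1 - i to the right subtree
    return sum(num_trees(i) * num_trees(n - 1 - i) for i in range(n))

print([num_trees(n) for n in range(6)])  # [1, 1, 2, 5, 14, 42]
```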
So in your picture, only one row is actually relevant if that is supposed to be a binary search tree. Also note that there is a difference between a binary tree and a binary search tree. From first article below:
As we know, the BST is an ordered data structure that allows no
duplicate values. However, Binary Tree allows values to be repeated
twice or more. Furthermore, Binary Tree is unordered.
References
https://www.baeldung.com/cs/calculate-number-different-bst
https://en.wikipedia.org/wiki/Binary_tree#Using_graph_theory_concepts
https://encyclopediaofmath.org/wiki/Binary_tree

Possible Binary Search Tree and Binary Tree with the following nodes

Below are the possible binary trees and binary search trees with 3 nodes A, B, C.
Is it correct?
Close, it's a good attempt.
However, number 3 is not a valid sorted tree since it has A coming after B (BAC). I think instead you should have chosen the final one on the page
A
 \
  C
 /
B
There's also the mirror of that one:
  C
 /
A
 \
  B
As a balanced search tree, number two is the one you want, since in all the others the two subtrees of some node differ in height by more than one.
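To double-check by enumeration, here is a rough sketch that generates every BST over the keys A, B, C and keeps the height-balanced ones. It assumes "balanced" means the two subtrees of every node differ in height by at most one; the helper names are illustrative.

```python
def all_bsts(keys):
    """Yield every BST over the sorted sequence `keys` as (key, left, right) tuples."""
    if not keys:
        yield None
        return
    for i, root in enumerate(keys):
        # Smaller keys must go left of the root, larger keys right.
        for left in all_bsts(keys[:i]):
            for right in all_bsts(keys[i + 1:]):
                yield (root, left, right)

def height(tree):
    return 0 if tree is None else 1 + max(height(tree[1]), height(tree[2]))

def is_balanced(tree):
    if tree is None:
        return True
    return (abs(height(tree[1]) - height(tree[2])) <= 1
            and is_balanced(tree[1]) and is_balanced(tree[2]))

trees = list(all_bsts("ABC"))
balanced = [t for t in trees if is_balanced(t)]
print(len(trees))  # 5
print(balanced)    # [('B', ('A', None, None), ('C', None, None))]
```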

Binary Search Trees / Picking a Root

I'm not quite sure how to pick a root for a binary search tree (I want to do it by hand, without any code):
5, 9, 2, 1, 4, 8, 3, 7, 6
How do I pick a root?
The steps are confusing me for this algorithm.
You can initialize an empty BST (binary search tree), then iterate the list and insert each item.
You don't need to pick a root, just build the tree. If you want the tree to be balanced, you can insert the middle value of the sorted list first, but the more robust answer is to use a self-balancing binary search tree (such as an AVL tree).
The median will be a better choice, because you want the tree to have less depth.
Here is one example, where the root is the median of all the values and each child is in turn the median of the values in its own subtree:
        5
      /   \
     3     8
    / \   / \
   2   4 7   9
  /     /
 1     6
5 is obtained as (1+9)/2, and 3 as ceiling((1+4)/2). (You can also take the floor instead of the ceiling as the rule for choosing the median root.)
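The median rule can be sketched as a recursive build over the sorted values. This is only an illustration: it picks the upper middle element (index `len // 2`), which reproduces the tree above for these nine values; the names `build_balanced` and `levels` are made up for the example.

```python
def build_balanced(sorted_values):
    """Build a BST as (value, left, right) tuples, using the middle element as root."""
    if not sorted_values:
        return None
    mid = len(sorted_values) // 2  # upper middle for even lengths
    return (sorted_values[mid],
            build_balanced(sorted_values[:mid]),
            build_balanced(sorted_values[mid + 1:]))

def levels(tree):
    """Return the values level by level, top to bottom."""
    result = []
    frontier = [tree] if tree is not None else []
    while frontier:
        result.append([node[0] for node in frontier])
        frontier = [child for node in frontier
                    for child in node[1:] if child is not None]
    return result

values = sorted([5, 9, 2, 1, 4, 8, 3, 7, 6])
print(levels(build_balanced(values)))
# [[5], [3, 8], [2, 4, 7, 9], [1, 6]]
```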
BST with the same values can have many forms. For example, a tree containing 1,2 can be:
1      <- root
 \
  2    <-- right son
or
  2    <- root
 /
1      <-- left son
So you can have a tree where 1 is the root and it goes 1->2->3... and no left sons. You can have 5 as the root with 4 and 6 as left and right sons respectively, and you can have many other trees with the same values, but different ordering (and maybe different roots)
How do I pick a root?
In whichever way you want to. Any number of your data can be the root.
You would like to choose the median though, in this case, 5. With that choice, your tree should get as balanced as it gets, four nodes on the left of 5 and four nodes in the right subtree of 5.
Notice that any element could be the root (even a random choice, or the first number in your example).
Then why should I worry about finding the median instead of always picking the first number (the easiest choice)?
Because you want your Binary Search Tree (BST) to be as balanced as possible.
If you pick the minimum or the maximum as the root, and keep doing so for every subtree, then your tree will reach its maximum depth (worst case scenario) and will emulate a singly linked list, which is the worst case for the search algorithm as well. However, as Michel stated, picking the minimum or maximum item for the root won't necessarily lead to a degenerate tree. For example, if you picked the minimum item for the root but the right branch that contains the rest of the items is balanced, then the tree's height is only one level more than optimum. You get a degenerate tree when, for example, the nodes are inserted in ascending or descending order.
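A small sketch comparing the resulting heights for the values 1 to 9. It assumes plain BST insertion without rebalancing and counts height in levels; the three insertion orders are examples chosen here, not taken from the answers.

```python
def insert(tree, value):
    """Insert value into a BST stored as (value, left, right); None is the empty tree."""
    if tree is None:
        return (value, None, None)
    root, left, right = tree
    if value < root:
        return (root, insert(left, value), right)
    return (root, left, insert(right, value))

def height(tree):
    """Number of levels in the tree (0 for the empty tree)."""
    return 0 if tree is None else 1 + max(height(tree[1]), height(tree[2]))

def height_after_inserting(order):
    tree = None
    for value in order:
        tree = insert(tree, value)
    return height(tree)

ascending = [1, 2, 3, 4, 5, 6, 7, 8, 9]          # always the minimum first
min_then_balanced = [1, 6, 4, 8, 3, 5, 7, 9, 2]  # minimum as root, rest balanced
median_first = [5, 3, 8, 2, 4, 7, 9, 1, 6]       # median as root at every step

print(height_after_inserting(ascending))          # 9
print(height_after_inserting(min_then_balanced))  # 5
print(height_after_inserting(median_first))       # 4
```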
Keep in mind that in a BST, this rule must be respected:
Left children are less than the parent node and
all right children are greater than the parent node.
For more, read How binary search tree is created??

Efficient algorithm for eliminating nodes in "graph"?

Suppose I have a graph with 2^N - 1 nodes, numbered 1 to 2^N - 1. Node i "depends on" node j if all the bits in the binary representation of j that are 1, are also 1 in the binary representation of i. So, for instance, if N=3, then node 7 depends on all other nodes. Node 6 depends on nodes 4 and 2.
The problem is eliminating nodes. I can eliminate a node if no other nodes depend on it. No nodes depend on 7; so I can eliminate 7. After eliminating 7, I can eliminate 6, 5, and 3, etc. What I'd like is to find an efficient algorithm for listing all the possible unique elimination paths. (that is, 7-6-5 is the same as 7-5-6, so we only need to list one of the two). I have a dumb algorithm already, but I think there must be a better way.
I have three related questions:
Does this problem have a general name?
What's the best way to solve it?
Is there a general formula for the number of unique elimination paths?
Edit: I should note that a node cannot depend on itself, by definition.
Edit2: Let S = {s_1, s_2, s_3,...,s_m} be the set of all m valid elimination paths. s_i and s_j are "equivalent" (for my purposes) iff the two eliminations s_i and s_j would lead to the same graph after elimination. I suppose to be clearer I could say that what I want is the set of all unique graphs resulting from valid elimination steps.
Edit3: Note that elimination paths may be different lengths. For N=2, the 5 valid elimination paths are (),(3),(3,2),(3,1),(3,2,1). For N=3, there are 19 unique paths.
Edit4: Re: my application - the application is in statistics. Given N factors, there are 2^N - 1 possible terms in a statistical model (see http://en.wikipedia.org/wiki/Analysis_of_variance#ANOVA_for_multiple_factors) that can contain the main effects (the factors alone) and various (2,3,... way) interactions between the factors. But an interaction can only be present in a model if all sub-interactions (or main effects) are present. For three factors a, b, and c, for example, the 3 way interaction a:b:c can only be present if all the constituent two-way interactions (a:b, a:c, b:c) are present (and likewise for the two-ways). Thus, the model a + b + c + a:b + a:b:c would not be allowed. I'm looking for a quick way to generate all valid models.
It seems easier to think about this in terms of sets: you are looking for families of subsets of {1, ..., N} such that for each set in the family also all its subsets are present. Each such family is determined by its inclusion-wise maximal sets, none of which can contain another. Families of sets in which no set contains another are called Sperner families (or antichains). So you are looking for Sperner families, plus all the subsets of the sets in the family. Possibly known algorithms for enumerating Sperner families or antichains in general are useful; without knowing what you actually want to do with them, it's hard to tell.
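For small N the counts quoted in the question can be confirmed by brute force. The sketch below is not an efficient algorithm: it tries every subset of nodes and keeps those that are a valid set of eliminated nodes, meaning that whenever a node is eliminated, every node depending on it has been eliminated too. The function names are illustrative.

```python
def depends_on(i, j):
    """Node i depends on node j if every 1 bit of j is also set in i (and i != j)."""
    return i != j and (i & j) == j

def valid_eliminations(n):
    """Return every set of nodes that can be eliminated, for nodes 1 .. 2**n - 1."""
    nodes = range(1, 2 ** n)
    result = []
    for mask in range(2 ** len(nodes)):
        eliminated = {v for v in nodes if mask >> (v - 1) & 1}
        # Valid only if all dependents of each eliminated node are gone as well.
        if all(i in eliminated
               for j in eliminated
               for i in nodes if depends_on(i, j)):
            result.append(eliminated)
    return result

print(len(valid_eliminations(2)))  # 5
print(len(valid_eliminations(3)))  # 19
```

The loop inspects 2^(2^N - 1) candidate sets, so it is only usable for N up to about 4.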
Thanks to @FalkHüffner's answer, I saw that what I wanted to do was equivalent to finding monotonic Boolean functions for N arguments. If you look at the figure on the Wikipedia page for Dedekind numbers (http://en.wikipedia.org/wiki/Dedekind_number) the figure expresses the problem graphically. There is an algorithm for generating monotonic Boolean functions (http://www.mathpages.com/home/kmath094.htm) and it is quite simple to construct.
For my purposes, I use the algorithm, then eliminate the first column and last row of the resulting binary arrays. Starting from the top row down, each row has a 1 in the ith column if one can eliminate the ith node.
Thanks!
You can build a "heap", in which at depth X are all the nodes with X zeros in their binary representation.
Then, starting from the bottom layer, connect each item to a random parent at the layer above, until you get a single-component graph.
Note that this graph is a tree, i.e., each node except for the root has exactly one parent.
Then, traverse the tree (starting from the root) and count the total number of paths in it.
UPDATE:
The method above is bad, because you cannot just pick a random parent for a given item - you have a limited number of items from which you can pick a "legal" parent... But I'm leaving this method here for other people to give their opinion (perhaps it is not "that bad").
In any case, why don't you take your graph, extract a spanning tree (you can use Prim's algorithm or Kruskal's algorithm for finding a minimum spanning tree), and then count the number of paths in it?

Find whether the following tree exists in the list of million of binary search trees

For example, consider the following trees and check whether they exist in the list of BSTs.
5
/ \
4 6
/ \
1 3
3
/ \
2 4
How should I approach this problem?
Sort the list according to the root (if roots are same then left node etc). For each query tree do a binary search.
This works well if the number of queries is comparable to the number of elements in the list. Complexity: O((n+m) log n), where m is the number of queries and n is the number of elements in the list.
If the number of queries is small, brute-force searching is efficient.
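A rough sketch of the sort-then-binary-search idea. It assumes each tree is stored as a nested tuple (value, left, right) with `()` for an empty subtree, so that Python's tuple ordering compares the root first, then the left subtree, then the right; this representation and the helper names are assumptions made for the example.

```python
from bisect import bisect_left

EMPTY = ()  # empty subtree; tuples keep every comparison well defined

def leaf(value):
    return (value, EMPTY, EMPTY)

def contains(sorted_trees, query):
    """Binary search for `query` in a list of trees sorted by tuple order."""
    i = bisect_left(sorted_trees, query)
    return i < len(sorted_trees) and sorted_trees[i] == query

# Sort once up front (n log n comparisons), then search per query.
trees = sorted([
    (5, (4, leaf(1), EMPTY), leaf(6)),
    (3, leaf(2), leaf(4)),
    (2, leaf(1), leaf(3)),
])

print(contains(trees, (3, leaf(2), leaf(4))))  # True
print(contains(trees, (3, leaf(1), leaf(4))))  # False
```

Each comparison may have to walk both trees, so the cost per comparison grows with tree size.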
I'll put it up as an answer so people can make variations if they'd like.
A naive approach would be to just scan through the list, compare each node and once you see a difference in the two trees you're comparing, just go on to the next one in the list. => O(N) where N is the total number of nodes.
The answer to this question was to put all the trees of the list in a hash table, so that a tree can be looked up in constant time.
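One way this could look, as a sketch only: convert each tree to a hashable canonical form and store the forms in a set. The `Node` class and the `canonical` helper are hypothetical; the original answer does not say how the trees are represented or hashed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def canonical(node):
    """Convert a tree to a nested tuple that is equal only for identical trees."""
    if node is None:
        return None
    return (node.value, canonical(node.left), canonical(node.right))

# Build the hash table once from the list of trees.
trees = [
    Node(5, Node(4, Node(1)), Node(6)),
    Node(3, Node(2), Node(4)),
]
index = {canonical(tree) for tree in trees}

print(canonical(Node(3, Node(2), Node(4))) in index)  # True
print(canonical(Node(3, Node(4), Node(2))) in index)  # False
```

The lookup is constant time in the number of stored trees on average, although building the key for a query still takes time proportional to the size of that tree.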
