What does "binary tree with pre-, post-, and in-order numbers assigned to the nodes" mean? - binary-tree

My question comes from an exercise in the book "Open Data Structures":
"Suppose we are given a binary tree with pre-, post-, and
in-order numbers assigned to the nodes. Show how these numbers can be
used to answer each of the following questions in constant time:"
I need to do this part of the exercise: "Given a node u, determine the depth of u".
Can someone explain what "Suppose we are given a binary tree with pre-, post-, and
in-order numbers assigned to the nodes" means (ideally with an example) and give a hint for the exercise?

Related

Calculating the number of 2-3 trees if the number of nodes is given

I'm trying to work out how many trees are possible when the number of data items is given.
For example, if there are 8 distinct data items, how many trees can be made?
The question is not totally clear. I'll assume you want the number of full binary (resp. ternary) trees with exactly n leaves (if you're rather looking for n nodes in total, you'll be able to find it from my result; if you want the number of trees that can store a given set of n different data, i.e. take into account all the ways to store your data in such a tree, then you should also be able to get it easily).
Let's consider the case of full binary trees. If the tree has n leaves, let k be the number of leaves in the left subtree; the right subtree then has n-k leaves. These subtrees can be chosen in N(k) and N(n-k) ways respectively, so
N(n) = sum(N(k)*N(n-k), k=1..n-1), with N(1) = 1.
This is the (a?) definition of the Catalan numbers: https://en.wikipedia.org/wiki/Catalan_number
N(n) = C(n-1)
You can do something similar for ternary trees.
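As a rough check of the recurrence above, here is a minimal Python sketch. The function names are made up for illustration, and the closed form C(m) = binom(2m, m) / (m + 1) is the standard Catalan formula rather than something stated in the answer.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_full_binary_trees(n: int) -> int:
    """N(n): number of full binary trees with exactly n leaves (n >= 1)."""
    if n == 1:
        return 1
    # k leaves go to the left subtree, n - k to the right one.
    return sum(
        count_full_binary_trees(k) * count_full_binary_trees(n - k)
        for k in range(1, n)
    )


def catalan(m: int) -> int:
    """Closed form of the m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


# For 8 leaves the recurrence and the closed form agree: N(8) = C(7).
print(count_full_binary_trees(8), catalan(7))
```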

Nearest node in a tree

Recently I encountered this problem on trees, for which I found an O(n*q) solution. I am wondering whether there is a better approach with lower complexity.
The problem is as follows:
Given an unweighted tree of n nodes (n >= 1, and n can go up to 10^5), each node is either special or non-special. Node 1 is always special and the rest are non-special initially. There are two operations:
1. We can turn any non-special node into a special node with the update operation "U Node_Number".
OR
2. At any time, we can issue the query "Q Node_Number", which should return the special node in the tree closest to "Node_Number".
The number of operations can also go up to 10^5.
My solution:
I thought of creating an adjacency list. For operation 1, I can record whether a node is special with a boolean flag. For operation 2, my solution runs a BFS whenever "Q Node_Number" is asked, taking "Node_Number" as the root of the BFS.
But the complexity is quadratic. Is this the best way of approaching this problem?
Here's an O(n^1.5 + n^0.5 q)-time algorithm via a sqrt decomposition. We need a constant-time distance oracle (this is basically least common ancestors). The idea is, every n^0.5 times a node is made special, perform a breadth-first search from all special nodes, which yields for each node in the tree the closest node that is currently special. On each query, take the closest of (i) the nodes that were special as of the last breadth-first search (ii) the at most n^0.5 newly special nodes.
As I mentioned in the comments, I expect that there's a very complicated O((n + q) log n)-time algorithm via top trees.
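A minimal Python sketch of the sqrt-decomposition idea, under some assumptions: the class and method names are invented for illustration, and the distance oracle here uses binary-lifting LCA, which costs O(log n) per distance rather than the constant time the answer assumes (a constant-time LCA structure could be swapped in). Ties between equally close special nodes are broken arbitrarily.

```python
from collections import deque
from math import isqrt


class NearestSpecial:
    """Nodes are numbered 1..n; node 1 starts out special."""

    def __init__(self, n, edges):
        self.n = n
        self.adj = [[] for _ in range(n + 1)]
        for a, b in edges:
            self.adj[a].append(b)
            self.adj[b].append(a)
        # Root the tree at node 1 and record depths and parents.
        self.log = max(1, n.bit_length())
        self.depth = [0] * (n + 1)
        self.up = [[0] * (n + 1) for _ in range(self.log)]
        self.up[0][1] = 1  # the root is its own parent
        seen = [False] * (n + 1)
        seen[1] = True
        queue = deque([1])
        while queue:
            u = queue.popleft()
            for w in self.adj[u]:
                if not seen[w]:
                    seen[w] = True
                    self.depth[w] = self.depth[u] + 1
                    self.up[0][w] = u
                    queue.append(w)
        for k in range(1, self.log):
            for v in range(1, n + 1):
                self.up[k][v] = self.up[k - 1][self.up[k - 1][v]]
        self.block = max(1, isqrt(n))
        self.special = [False] * (n + 1)
        self.special[1] = True
        self.pending = []  # nodes made special since the last rebuild
        self._rebuild()

    def _dist(self, u, v):
        """Tree distance via binary-lifting LCA (O(log n), an assumption)."""
        a, b = (u, v) if self.depth[u] >= self.depth[v] else (v, u)
        diff, k = self.depth[a] - self.depth[b], 0
        while diff:
            if diff & 1:
                a = self.up[k][a]
            diff >>= 1
            k += 1
        if a != b:
            for k in range(self.log - 1, -1, -1):
                if self.up[k][a] != self.up[k][b]:
                    a, b = self.up[k][a], self.up[k][b]
            a = self.up[0][a]
        return self.depth[u] + self.depth[v] - 2 * self.depth[a]

    def _rebuild(self):
        """Multi-source BFS from every special node."""
        self.best_dist = [-1] * (self.n + 1)
        self.best_node = [0] * (self.n + 1)
        queue = deque()
        for v in range(1, self.n + 1):
            if self.special[v]:
                self.best_dist[v], self.best_node[v] = 0, v
                queue.append(v)
        while queue:
            u = queue.popleft()
            for w in self.adj[u]:
                if self.best_dist[w] == -1:
                    self.best_dist[w] = self.best_dist[u] + 1
                    self.best_node[w] = self.best_node[u]
                    queue.append(w)
        self.pending = []

    def update(self, v):
        if not self.special[v]:
            self.special[v] = True
            self.pending.append(v)
            if len(self.pending) >= self.block:
                self._rebuild()

    def query(self, v):
        best_d, best_v = self.best_dist[v], self.best_node[v]
        for s in self.pending:  # at most about sqrt(n) candidates
            d = self._dist(v, s)
            if d < best_d:
                best_d, best_v = d, s
        return best_v
```

Usage would look like `t = NearestSpecial(n, edges)`, then `t.update(v)` for "U v" and `t.query(v)` for "Q v".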

Determine structural equivalence of BSTs with standard traversal

Is it possible to decide the structural equivalence of two binary search trees using only the results of traversals (pre-order, in-order and post-order)? Assume I have only the result arrays of the traversals. I know that in-order traversal alone can't help, but I couldn't work it out for the other traversals. I understand that BFS helps. I want to know about pre-order and post-order traversals. If possible, please post any links on this.
The answer is: you can recover a binary search tree from its pre-order traversal.
I'm not sure what your mathematical background is, so please ask if you need more explanation.
For simplicity, I'm assuming that the nodes are labeled by the integers 1, 2, ..., n, where n is the number of nodes. The pre-order traversal of the tree t then gives you a permutation of [n] = {1,2,...,n} which has a particular property: whenever a letter b appears in the permutation, you can't find two consecutive letters c a after b in the permutation such that a<b<c. Such a permutation is said to avoid the pattern b-ac (the - stands for an arbitrary number of letters).
For example, 4 2 1 3 avoids b-ac whereas 3 1 4 2 doesn't, because of 3 - 4 2.
This is actually an equivalence: a permutation is the pre-order reading of a tree iff it avoids b-ac.
It is known that there are as many trees of size n as permutations avoiding b-ac, so this is a bijection. Their number is known as the Catalan number. You can probably find this as an exercise in Stanley's book "Enumerative Combinatorics".
Here is a more algorithmic explanation:
RecTree: recovering a tree from its pre-order traversal:
input: list l = l[0..n]
output: tree t
if l is empty return the empty tree
b <- l[0]
find an index i such that
- for 1<=j<=i then l[j] < b and
- for i<j<=n then l[j] > b
if no such index exists return Failure
else return Node(key=b, RecTree(l[1..i]), RecTree(l[i+1..n]))
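The RecTree procedure can be sketched in Python roughly as follows. This is only an illustration: the tuple representation (key, left, right), the name rec_tree, and the use of ValueError for "Failure" are assumptions, keys are assumed distinct, and the list slicing makes it O(n^2) in the worst case.

```python
def rec_tree(keys):
    """Rebuild a BST from its pre-order key list.

    Returns None for an empty list, otherwise a (key, left, right) tuple.
    Raises ValueError if the list is not the pre-order of any BST.
    """
    if not keys:
        return None
    root = keys[0]
    # Keys smaller than the root directly after it form the left subtree.
    i = 1
    while i < len(keys) and keys[i] < root:
        i += 1
    # Everything after that must be larger than the root.
    if any(k <= root for k in keys[i:]):
        raise ValueError("not a pre-order traversal of a BST")
    return (root, rec_tree(keys[1:i]), rec_tree(keys[i:]))


# 4 2 1 3 is a valid pre-order; 3 1 4 2 would raise ValueError.
print(rec_tree([4, 2, 1, 3]))
```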
As a consequence:
Two binary search trees are equal if and only if they have the same pre-order traversal.
Does that make sense to you?
Some more references
Catalan number in the On-Line Encyclopedia of Integer Sequences
Anders Claesson Generalized Pattern Avoidance, European Journal of Combinatorics 22 (2001) 961–971
In a BST you can go to the left child (L), to the right child (R), or up (U). A traversal can then be described by a string over {L, R, U}, e.g. "LLURUURLURUU". For BSTs with equivalent structures, these strings will be identical.
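One way to picture the move string, as a small Python sketch: trees are assumed to be (key, left, right) tuples with None for a missing child, and the function name is hypothetical. The keys are ignored, so two trees with the same shape produce the same string.

```python
def move_string(tree):
    """Describe a depth-first walk of the tree as moves L, R and U."""
    _key, left, right = tree
    moves = []
    if left is not None:
        moves.append("L")
        moves.append(move_string(left))
        moves.append("U")
    if right is not None:
        moves.append("R")
        moves.append(move_string(right))
        moves.append("U")
    return "".join(moves)


a = (2, (1, None, None), (3, None, None))
b = (20, (10, None, None), (30, None, None))
c = (1, None, (2, None, (3, None, None)))
print(move_string(a) == move_string(b))  # same shape, different keys
print(move_string(a) == move_string(c))  # different shapes
```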

Finding closest number in a range

I thought of a problem, which is as follows:
We have an array A of n integers and t test cases. In every test case we are given a number m and a range [s,e], and we have to find the number closest to m within that range of the array (A[s..e]).
You may assume the array is indexed from 1 to n.
For example:
A = {5, 12, 9, 18, 19}
m = 13
s = 4 and e = 5
So the answer should be 18.
Constraints:
n<=10^5
t<=n
All I could think of is an O(n) solution for every test case, and I think a better solution exists.
This is a rough sketch:
Create a segment tree from the data. At each node, besides the usual data like left and right indices, you also store the numbers found in the sub-tree rooted at that node, stored in sorted order. You can achieve this when you construct the segment tree in bottom-up order. In the node just above the leaf, you store the two leaf values in sorted order. In an intermediate node, you keep the numbers in the left child, and right child, which you can merge together using standard merging. There are O(n) nodes in the tree, and keeping this data should take overall O(nlog(n)).
Once you have this tree, for every query, walk down the tree until you reach the node(s) that cover the given range [s, e]. As the tutorial shows, one or more different nodes combine to form the given range. As the tree depth is O(log(n)), reaching these nodes takes O(log(n)) time per query, and there are O(log(n)) of them. For each node that lies completely inside the range, find the closest number using binary search in the sorted array stored in that node, which is O(log(n)) per node. Take the closest among all these, and that is the answer. Thus, you can answer each query in O(log^2(n)) time: O(log(n)) nodes, each with an O(log(n)) binary search.
The tutorial I link to contains other data structures, such as sparse table, which are easier to implement, and should give O(sqrt(n)) per query. But I haven't thought much about this.
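A rough Python sketch of the segment tree with sorted lists described above (sometimes called a merge sort tree). The class and method names are invented for illustration, the tree is stored iteratively in a flat list, and ties between two equally close values are broken arbitrarily.

```python
from bisect import bisect_left
from heapq import merge


class MergeSortTree:
    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        # tree[node] holds the sorted values of that node's range.
        self.tree = [[] for _ in range(2 * self.size)]
        for i, x in enumerate(values):
            self.tree[self.size + i] = [x]
        for node in range(self.size - 1, 0, -1):
            left, right = self.tree[2 * node], self.tree[2 * node + 1]
            self.tree[node] = list(merge(left, right))  # standard merge

    @staticmethod
    def _closer(sorted_vals, m, best):
        """Binary search one node's list and keep the closest value so far."""
        i = bisect_left(sorted_vals, m)
        for j in (i - 1, i):
            if 0 <= j < len(sorted_vals):
                cand = sorted_vals[j]
                if best is None or abs(cand - m) < abs(best - m):
                    best = cand
        return best

    def closest(self, s, e, m):
        """Value in A[s..e] (1-indexed, inclusive) closest to m."""
        best = None
        lo, hi = s - 1 + self.size, e + self.size  # half-open [lo, hi)
        while lo < hi:
            if lo & 1:
                best = self._closer(self.tree[lo], m, best)
                lo += 1
            if hi & 1:
                hi -= 1
                best = self._closer(self.tree[hi], m, best)
            lo >>= 1
            hi >>= 1
        return best


tree = MergeSortTree([5, 12, 9, 18, 19])
print(tree.closest(4, 5, 13))  # the example from the question
```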
Sort the array and do a binary search. Complexity: O(n log n + t log n).
I'm fairly sure no faster solution exists. A slight variation of your problem is:
There is no array A, but each test case contains an unsorted array of numbers to search. (The array slice of A from s to e).
In that case, there is clearly no better way than a linear search for each test case.
Now, in what way is your original problem more specific than the variation above? The only added information is that all the slices come from the same array. I don't think that this additional constraint can be used for an algorithmic speedup.
EDIT: I stand corrected. The segment tree data structure should work.

What is the advantage of a full binary tree for Huffman code?

I am studying Huffman codes for bit-encoding a stream of characters, and I read that an optimal code is represented by a full binary tree in which each distinct character is represented by a leaf and all internal nodes have exactly two children.
I want to know why the full binary tree is the optimal choice here. In other words, what is the advantage of a full binary tree?
This is not a choice, but rather equivalence.
Optimal Huffman codes are decoded by a finite state machine, in which
- each state has exactly two exits (the next bit being 0 or 1)
- each state has exactly one entry
- all states containing output symbols are stop states, and
- all stop states contain output symbols
This is equivalent to a search tree where
- all internal nodes have exactly two children
- all nodes have exactly one parent
- all nodes containing output symbols are leaf nodes, and
- all leaf nodes contain output symbols
There are non-optimal Huffman codes as well, which have stop states / leaf nodes that do not contain output symbols. Such a binary tree would not be full.
Proof by contradiction:
Suppose that a tree T which is not a full binary tree provides optimal Huffman codes for the given characters and their frequencies. As T is not a full binary tree, there exists a node N which has only one child C.
Let us construct a new binary tree T' by replacing N with C. The depth of every leaf node under C is reduced by 1 in T' compared to T. So T' provides a better solution than T, which contradicts the assumption that T is optimal.
    T            T'
   / \          / \
  .   N        .   C
  .  /         .
  .  C         .
You asked why a full binary tree. That is actually three questions.
If you're asking about "full", then it must be full for any correctly generated Huffman code.
If you're asking about "binary", every encountered bit in a Huffman code has two possibilities, 0 or 1, so each node must have two branches.
If however you're asking about "tree", you do not need to represent the code as a tree at all. There are many representations that not only represent the code completely, but also that facilitate both a shorter representation in the compressed stream and faster decoding, than a tree would.
Examples are using a canonical Huffman code, and representing it simply as the counts of symbols at each bit length, and a list of corresponding symbols. This is used in the puff.c code. Or you can generate a set of tables that decode several bits at a time in stages, which is used in zlib's inflate. There are others.
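To illustrate the canonical representation, here is a small Python sketch that assigns canonical codes from code lengths alone, following the scheme described in the DEFLATE specification (RFC 1951). It is not taken from puff.c or zlib; the function name and the dict-based interface are assumptions, and all lengths are assumed to be positive and to form a valid prefix code.

```python
def canonical_codes(lengths):
    """Map each symbol to its canonical code, given symbol -> bit length."""
    max_len = max(lengths.values())
    # Count how many symbols use each bit length.
    count = [0] * (max_len + 1)
    for length in lengths.values():
        count[length] += 1
    # First code value for each bit length.
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + count[bits - 1]) << 1
        next_code[bits] = code
    # Hand out consecutive codes within each length, in symbol order.
    codes = {}
    for symbol in sorted(lengths):
        length = lengths[symbol]
        codes[symbol] = format(next_code[length], "0{}b".format(length))
        next_code[length] += 1
    return codes


lengths = {"A": 3, "B": 3, "C": 3, "D": 3, "E": 3, "F": 2, "G": 4, "H": 4}
print(canonical_codes(lengths))
```

Because the codes follow from the lengths, only the per-length counts and the ordered symbol list need to be stored, which is the compact representation the answer refers to.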
