How can you calculate depth of a binary tree with less complexity?

Given a binary search tree t, it is rather easy to get its depth using recursion, as follows:
def node_height(t):
    # An empty subtree contributes no levels.
    if t is None:
        return 0
    height_left = node_height(t.left)
    height_right = node_height(t.right)
    return 1 + max(height_left, height_right)
However, I noticed that its complexity increases exponentially, and thus should perform very badly when we have a deep tree. Is there any faster algorithm for doing this?

If you store the height as a field in the Node object, you can increment it where needed as you add nodes to the tree (and decrement it on removal).
That'll make the operation constant time for getting the height of any node, but it adds some additional complexity into the add/remove operations.

This kind of extends what @cricket_007 mentioned in his answer.
So, if you compute 1 + max(height_left, height_right) recursively, you end up having to visit every node, which is essentially an O(N) operation, not an exponential one. For a balanced tree, the recurrence is T(n) = 2T(n/2) + Θ(1), which solves to Θ(n).
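As a rough check of the linear bound, the sketch below counts how many nodes the recursive height computation touches. The Node class and the helper names height_and_visits and perfect_tree are made up for this illustration and are not from the question.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height_and_visits(node):
    """Return (height, number of nodes visited) for the subtree at node."""
    if node is None:
        return 0, 0
    left_height, left_visits = height_and_visits(node.left)
    right_height, right_visits = height_and_visits(node.right)
    # Each node is counted exactly once, so visits equals the node count.
    return 1 + max(left_height, right_height), 1 + left_visits + right_visits


def perfect_tree(levels):
    """Build a perfect binary tree with the given number of levels."""
    if levels == 0:
        return None
    return Node(levels, perfect_tree(levels - 1), perfect_tree(levels - 1))


for levels in (3, 10):
    height, visits = height_and_visits(perfect_tree(levels))
    print(levels, height, visits)  # visits is 2**levels - 1, the node count
```

The visit count grows with the number of nodes, not exponentially in it; it only looks exponential when measured against the height of a full tree.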
Now, this can be improved to a time of O(1) if you can store the height of a certain node. In that case, the height of the tree would be equal to the height of the root. So, the modification you would need to make would be to your insert(value) method. At the beginning, the root is given a default height of 0. The node to be added is assigned a height of 0. For every node you encounter while trying to add this new node, increase node.height by 1 if needed, and ensure it is set to 1 + max(left child's height, right child's height). So, the height function will simply return node.height, hence allowing for constant time. The time complexity for the insert will also not change; we just need some extra space to store n integer values, where n is the number of nodes.
The following trace illustrates the idea; the stored height of each node is shown in brackets.
          5 [0]

- insert 2 [increase height of root by 1]

          5 [1]
         /
        /
   [0] 2

- insert 1 [increase height of node 2 by 1, increase height of node 5 by 1]

          5 [2]
         /
        /
   [1] 2
      /
     /
[0] 1

- insert 3 [new height of node 2 = 1 + max(height of node 1, height of node 3)
            = 1 + 0 = 1; height of node 5 also does not change]

          5 [2]
         /
        /
   [1] 2
      / \
     /   \
[0] 1     3 [0]

- insert 6 [new height of node 5 = 1 + max(height of node 2, height of node 6)
            = 1 + 1 = 2]

          5 [2]
         / \
        /   \
   [1] 2     6 [0]
      / \
     /   \
[0] 1     3 [0]
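The bookkeeping traced above could be sketched as follows. This is only a minimal, unbalanced BST insert; the names Node, stored_height and insert are assumptions for illustration, and treating an empty subtree as height -1 is a convention chosen here so that a leaf comes out as 0, matching the trace.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 0  # a freshly added node is a leaf


def stored_height(node):
    # Assumption: an empty subtree counts as -1 so that a leaf has height 0.
    return node.height if node is not None else -1


def insert(node, value):
    """Insert value into the BST rooted at node and return the subtree root."""
    if node is None:
        return Node(value)
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    # Refresh the stored height on the way back up the insertion path.
    node.height = 1 + max(stored_height(node.left), stored_height(node.right))
    return node


root = None
for value in (5, 2, 1, 3, 6):
    root = insert(root, value)
    print(value, root.height)  # root height after each insert: 0, 1, 2, 2, 2
```

Reading the height is then just root.height, while each insert still only touches the nodes on its own path.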

Related

Calculating height for AVL tree while inserting node

I have checked three sources for AVL insert code. In all of them, the height is calculated with the following line:
root.height = 1 + max(self.getHeight(root.left),
                      self.getHeight(root.right))
Here is my question: why should we take the max of the left and right subtree heights and add one to that?
What if we are adding the node to the subtree with the smaller height? In that case both subtrees will have the same height H, not H+1.
I think the height should instead be updated as follows:
elif key < root.key:
    root.left = self.insertNode(root.left, key)
    root.height = 1 + self.getHeight(root.left)
else:
    root.right = self.insertNode(root.right, key)
    root.height = 1 + self.getHeight(root.right)
Am I correct? If so, why do these sources add one after taking the max?
The full code is below for verification. It is taken from programiz.com; I also checked GeeksforGeeks.
def insertNode(self, root, key):
    if not root:
        return TreeNode(key)
    elif key < root.key:
        root.left = self.insertNode(root.left, key)
    else:
        root.right = self.insertNode(root.right, key)
    root.height = 1 + max(self.getHeight(root.left),
                          self.getHeight(root.right))
    balanceFactor = self.getBalance(root)
    if balanceFactor > 1:
        if key < root.left.key:
            return self.rightRotate(root)
        else:
            root.left = self.leftRotate(root.left)
            return self.rightRotate(root)
    if balanceFactor < -1:
        if key > root.right.key:
            return self.leftRotate(root)
        else:
            root.right = self.rightRotate(root.right)
            return self.leftRotate(root)
    return root
Suppose you have a tree like this:
      5
     / \
    /   \
   3     7
  /     / \
 2     6   8
            \
             9
The tree has a height of 3 (there are 3 branches between the root node 5 and the deepest leaf node 9).
The subtrees' heights are 1 for the left one (rooted at the node 3) and 2 for the right one (rooted at 7), and
3 = H(node(5)) = 1 + max(H(node(3)), H(node(7))) = 1 + max(1, 2)
Now suppose you add a node with a key 4 to the tree:
      5
     / \
    /   \
   3     7
  / \   / \
 2   4 6   8
            \
             9
The height of the subtree rooted at node 3 did not increase: H(node(3)) still equals 1.
With the proposed replacement, the tree would erroneously get a height of 2 after this insertion, namely 1 + H(node(3)), instead of keeping its height of 3.
If your version of the code has actually been 'verified' by any programming site, then run away from that site and never trust it again.
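The example above can be replayed in code. The sketch below is a plain BST insert without rotations (none would be triggered for these keys), using the edge-counting convention of this answer where a leaf has height 0. The names Node, h, insert and build, and the use_max flag, are hypothetical and exist only to switch between the two update rules.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0  # edge-counting convention: a leaf has height 0


def h(node):
    # An empty subtree is treated as height -1 under this convention.
    return node.height if node is not None else -1


def insert(node, key, use_max):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key, use_max)
        changed = node.left
    else:
        node.right = insert(node.right, key, use_max)
        changed = node.right
    if use_max:
        node.height = 1 + max(h(node.left), h(node.right))
    else:
        node.height = 1 + h(changed)  # the replacement proposed in the question
    return node


def build(keys, use_max):
    root = None
    for key in keys:
        root = insert(root, key, use_max)
    return root


keys = [5, 3, 7, 2, 6, 8, 9, 4]  # the tree above, then the insertion of 4
print(build(keys, use_max=True).height)   # 3
print(build(keys, use_max=False).height)  # 2, which is wrong
```

The rule without max only looks at the subtree that was just modified, so it forgets a taller sibling subtree.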

How to find all possible reachable numbers from a position?

Given two numbers n and s and an array A of size m, where s is the initial position with 1 <= s <= n, the task is to perform m operations on s. In operation i we set either s = s + A[i] or s = s - A[i]. We have to print all the values that are possible after the m operations; every value along the way must lie between 1 and n (inclusive).
Important Note: If during an operation we get a value s < 1 or s > n,
we don't go further with that value of s.
I solved the problem using BFS, but BFS is not optimal here. Can someone suggest a more efficient approach or algorithm?
For example:
If n = 3, s = 3, and A = {1, 1, 1}
                        3
                      /   \
operation 1:         2     4        (we don’t proceed with 4 as it is > n)
                    / \   / \
operation 2:       1   3 3   5
                  / \ / \ / \ / \
operation 3:     0  2 2 4 2 4 4  6
So the final values reachable by following the above rules are 2 and 2 (that is, 2 is reached twice). We don't count the third 2, because its path passes through an intermediate state that is > n (the same applies if a state is < 1).
There is this dynamic programming solution, which runs in O(nm) time and requires O(n) space.
First establish a boolean array called reachable, initialize it to false everywhere except for reachable[s], which is true.
This array now represents whether a number is reachable in 0 steps. Now for every i from 1 to m, we update the array so that reachable[x] represents whether the number x is reachable in i steps. This is easy: x is reachable in i steps if and only if either x - A[i] or x + A[i] is reachable in i - 1 steps.
In the end, the array becomes the final result you want.
EDIT: pseudo-code here.
// initialization:
for x = 1 to n:
    r[x] = false
r[s] = true
// main loop:
for k = 1 to m:
    for x = 1 to n:
        last_r[x] = r[x]
    for x = 1 to n:
        r[x] = (last_r[x + A[k]] or last_r[x - A[k]])
Here last_r[x] is by convention false if x is not in the range [1 .. n].
If you want to track the number of ways in which each number can be reached, make the following changes:
Change the array r to an integer array;
In the initialization, initialize all r[x] to 0, except r[s] to 1;
In the main loop, change the key line to:
r[x] = last_r[x + A[k]] + last_r[x - A[k]]
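A runnable sketch of the counting variant might look like this. The function name reachable_counts is an assumption, and the loop pushes counts forward from each reachable position instead of pulling them from x - A[k] and x + A[k]; the two formulations compute the same table.

```python
def reachable_counts(n, s, steps):
    """Map each final position in 1..n to the number of ways to reach it."""
    counts = [0] * (n + 1)  # index 0 is unused; positions are 1..n
    counts[s] = 1
    for step in steps:
        nxt = [0] * (n + 1)
        for x in range(1, n + 1):
            if counts[x] == 0:
                continue
            for y in (x - step, x + step):
                # Positions outside 1..n are dropped and never continued.
                if 1 <= y <= n:
                    nxt[y] += counts[x]
        counts = nxt
    return {x: c for x, c in enumerate(counts) if c > 0}


print(reachable_counts(3, 3, [1, 1, 1]))  # {2: 2}
```

For the example in the question this reports that only 2 is reachable, in two ways. It runs in O(nm) time and keeps two arrays of size n + 1.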

Ideal height of tree structure

How can I calculate the ideal height of a tree structure?
When I have this tree
I know the height is 4.
There's a formula that says that the ideal height of a tree is 2 ^ height - 1 but that doesn't make sense to me (since it would be 15).
Can someone please explain?
Well, first of all, that formula applies only to binary trees. Second, it gives a number of nodes, not a height: the maximum number of nodes in a binary tree is 2^height - 1. For a saturated binary tree of height 4, the number of nodes will be 15.
That formula is for the maximum number of nodes that can be included in a binary tree of that height. Assuming you want the tree to be as shallow as possible, you want to know the minimum height of such a tree given the number of nodes. So you simply invert:
nodes = 2^height - 1
to get
height = log2(nodes + 1)
rounded up.
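As a small sketch of the inversion above, assuming integer node counts (the function names min_height and max_nodes are made up here). For a non-negative integer, the ceiling of log2(nodes + 1) equals the bit length of nodes, which avoids floating-point rounding.

```python
def max_nodes(height):
    """Maximum number of nodes in a binary tree with `height` levels."""
    return 2 ** height - 1


def min_height(nodes):
    """Smallest number of levels a binary tree needs to hold `nodes` nodes."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    # For integers, ceil(log2(nodes + 1)) equals the bit length of nodes.
    return nodes.bit_length()


print(max_nodes(4))    # 15
print(min_height(15))  # 4
print(min_height(16))  # 5
```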
The height of a tree is the maximum height among all the nodes in the tree. Now say you have this tree:
      1
     / \
    2   3
   / \ / \
  4  5 6  7
The height of the tree is 3 (all root-to-leaf paths have the same length, so take 1-2-5 as a longest one; it passes through 3 nodes). There are three levels, and the number of nodes at each level is:
      1           = 2^0
     / \
    2   3         = 2^1
   / \ / \
  4  5 6  7       = 2^2
total = 2^0 + 2^1 + 2^2, which is a geometric progression with sum 2^3 - 1. Hence the number of nodes = 2^height - 1.
If you count levels instead (they start from 0), the number of nodes = 2^(level+1) - 1.

Determine distance between two random nodes in a tree

Given a general tree, I want the distance between two nodes v and w.
Wikipedia states the following:
Computation of lowest common ancestors may be useful, for instance, as part of a procedure for determining the distance between pairs of nodes in a tree: the distance from v to w can be computed as the distance from the root to v, plus the distance from the root to w, minus twice the distance from the root to their lowest common ancestor.
Let's say d(x) denotes the distance of node x from the root, which we take to be node 1. d(x,y) denotes the distance between two vertices x and y. lca(x,y) denotes the lowest common ancestor of the vertex pair x and y.
Thus if we have 4 and 8, lca(4,8) = 2. Therefore, according to the description above, d(4,8) = d(4) + d(8) - 2 * d(lca(4,8)) = 2 + 3 - 2 * 1 = 3. Great, that worked!
However, the formula seems to fail for the vertex pair (8,3): with lca(8,3) = 2, d(8,3) = d(8) + d(3) - 2 * d(2) = 3 + 1 - 2 * 1 = 2. This is incorrect, however: the distance d(8,3) = 4, as can be seen on the graph. The algorithm seems to fail for anything that crosses over the defined root.
What am I missing?
You missed that lca(8,3) = 1, not 2. Since d(1) = 0, this gives:
d(8,3) = d(8) + d(3) - 2 * d(1) = 3 + 1 - 2 * 0 = 4
For the appropriate node 2, namely the one on the right, d(lca(8,2)) == 0, not 1 as you have it in your derivation. The distance from the root (which is the lca in this case) to itself is zero. So
d(8,2) = d(8) + d(2) - 2 * d(lca(8,2)) = 3 + 1 - 2 * 0 = 4
The fact that you have two nodes labeled 2 is probably confusing things.
Edit: The post has been edited so that a node originally labeled 2 is now labeled 3. In this case, the derivation is now correct, but the statement
the distance d(8,2) = 4 as can be seen on the graph
is incorrect: d(8,2) = 2.
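The formula can be checked on a small tree. The graph from the question is not reproduced here, so the parent map below is a hypothetical tree chosen only to be consistent with the distances quoted above (root 1, d(2) = d(3) = 1, d(4) = 2, d(8) = 3); node 5 and all function names are assumptions.

```python
# Hypothetical tree: 1 is the root; 2 and 3 are its children;
# 4 and 5 are children of 2; 8 is a child of 5.
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 8: 5}


def path_to_root(v):
    """Return [v, parent(v), ..., root]."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path


def depth(v):
    return len(path_to_root(v)) - 1  # the root has depth 0


def lca(v, w):
    ancestors = set(path_to_root(v))
    # The first ancestor of w that is also an ancestor of v is the lowest one.
    for node in path_to_root(w):
        if node in ancestors:
            return node
    raise ValueError("nodes are not in the same tree")


def distance(v, w):
    return depth(v) + depth(w) - 2 * depth(lca(v, w))


print(lca(4, 8), distance(4, 8))  # 2 3
print(lca(8, 3), distance(8, 3))  # 1 4
```

Walking the parent pointers makes it hard to pick the wrong lowest common ancestor by eye: for a pair that crosses the root, the lca is the root itself and its depth is 0.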

Analytical solution to predict array size of binary tree

I'm constructing a binary tree for a sequence of data and the tree is stored in a 1-based array. So if index of parent node is idx,
the left child is 2 * idx and the right is 2 * idx + 1.
In every iteration, I sort the current sequence based on certain criteria, select the median element as the parent, set tree[index] = sequence[median], then do the same operation recursively on the left (the subsequence before the median) and the right (the subsequence after the median).
E.g., if there are 3 elements in total, the tree will be:
  1
 / \
2   3
the array size to store the tree is also 3
4 elements:
    1
   / \
  2   3
 /
4
the array size to store the tree is also 4
5 elements:
      1
     / \
    2   3
   / \  /
  4 null 5
the array size to store the tree has to be 6, since there is a hole between 4 and 5.
Thus, the array size is determined only by the number of elements. I believe there is an analytical solution for it; I just can't prove it.
Any suggestion will be appreciated.
Thanks.
A full binary tree has twice as many nodes on every level as on the previous one. If you have n nodes, then the number of levels required (the height of the tree) is log2(n) + 1, rounded down to a whole number. So if you have 5 nodes, your binary tree will have a height of 3.
The number of nodes in a full binary tree of height h is (2^h) - 1. So the maximum array size you need for 5 items is 7, assuming all the levels are filled except possibly the last one.
The last level of your tree will contain n - ((2^(h-1)) - 1) nodes, since the levels above it hold (2^(h-1)) - 1 nodes. The last level of a full tree contains 2^(h-1) nodes. Assuming you want the tree balanced so that half of the last-level nodes are on the left and half are on the right, and the right side is left-filled, that is, you want this:
               1
       2               3
   4       5       6       7
 8   9           10  11
The number of array spaces required for the last level of your tree, then, is either 1, or half the number required by a full tree plus half the number of last-level nodes in your tree.
So:
n = 5
height = roundDown(log2(n) + 1)
fullTreeNodes = (2^height) - 1
fullTreeLeafNodes = 2^(height-1)
nodesOnLeafLevel = n - (fullTreeNodes - fullTreeLeafNodes)
Now comes the fun part. If there is more than 1 node required on the leaf level, and you want to balance the sides, you need half of fullTreeLeafNodes, plus half of nodesOnLeafLevel. In the tree above, for example, the leaf level has a potential for 8 nodes. But you only have 4 leaf nodes. You want two of them on the left side, and two on the right. So you need to allocate space for 4 nodes on the left side (2 for the left side items, and 2 empty spaces), plus two more for the two right side items.
if (nodesOnLeafLevel == 1)
    arraySize = n
else
    arraySize = (fullTreeNodes - fullTreeLeafNodes/2) + (nodesOnLeafLevel / 2)
You really shouldn't have any holes. They are created by your partitioning algorithm, but that algorithm is incorrect.
For 1-5 items, your trees should look like:
1      2      2        3          4
      /      / \      / \        / \
     1      1   3    2   4      2   5
                    /          / \
                   1          1   3
The easiest way to populate the tree is to do an in-order traversal of the node locations, filling items from the sequence in order.
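One way to sketch that in-order fill, assuming the items are already sorted and the tree lives in a 1-based array as in the question (the function name fill_in_order is made up for this example):

```python
def fill_in_order(items):
    """Place sorted items into a 1-based array laid out as a complete tree."""
    n = len(items)
    tree = [None] * (n + 1)  # slot 0 is unused
    remaining = iter(items)

    def visit(idx):
        if idx > n:
            return
        visit(2 * idx)               # fill the left subtree first
        tree[idx] = next(remaining)  # then this slot
        visit(2 * idx + 1)           # then the right subtree

    visit(1)
    return tree


print(fill_in_order([1, 2, 3, 4, 5]))  # [None, 4, 2, 5, 1, 3]
```

Only slots 1..n are ever visited, so the array has no holes, and for five items the result matches the last tree drawn above.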
I'm close to formalizing a solution. Intuitively: first find the largest power of 2 below N, then check whether N - 2^m is even or odd to decide which part of the leaf level needs to grow.
int32_t rup2 = roundUpPower2(nPoints);
if (rup2 == nPoints || rup2 == nPoints + 1)
{
    return nPoints;
}
int32_t leaveLevelCapacity = rup2 / 2;
int32_t allAbove = leaveLevelCapacity - 1;
int32_t pointsOnLeave = nPoints - allAbove;
int32_t iteration = roundDownLog2(pointsOnLeave);
int32_t leaveSize = 1;
int32_t gap = leaveLevelCapacity;
for (int32_t i = 1; i <= iteration; ++i)
{
    leaveSize += gap / 2;
    gap /= 2;
}
return (allAbove + leaveSize);
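A closed form like this can be sanity-checked against a direct simulation of the construction in the question. The sketch below assumes the median is taken at position count // 2 (0-based), which reproduces the 3-, 4- and 5-element trees shown in the question; that choice and the name required_array_size are assumptions.

```python
def required_array_size(n):
    """Largest 1-based index used when n items are placed by median splits."""
    max_index = 0
    stack = [(1, n)]  # (array index of subtree root, items in that subtree)
    while stack:
        idx, count = stack.pop()
        if count == 0:
            continue
        max_index = max(max_index, idx)
        left = count // 2          # items before the median
        right = count - left - 1   # items after the median
        stack.append((2 * idx, left))
        stack.append((2 * idx + 1, right))
    return max_index


for n in (3, 4, 5, 11):
    print(n, required_array_size(n))  # sizes 3, 4, 6 and 14
```

Comparing this against the formula over a range of n would show quickly whether the two agree.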
