Is such a binary tree possible? I've drawn out what I believe to be all possible iterations, and I cannot find a tree that satisfies these properties. Note that this is not a BST, so the values of the keys don't matter. There are countless with exactly 1 "single child" node, such as :
a
/ \
b c
/ //b is only such node
d
/ \
e f
And many with exactly 3 "single child" nodes:
a
/
b
/ //a, b, and d
c
/ \
d e
/
f
Does such a binary tree exist (6 nodes, exactly 2 nodes with exactly 1 child)? If so, please provide an example.
This is an impossible structure to create with the standard binary tree containing 2 child pointers. If you had an nontraditional tree with 3 child pointers this would be possible.
Related
I'm trying to define an algorithm that returns the number of leaves at the lowest level of a complete binary tree. By a complete binary tree, I mean a binary tree whose every level, except possibly the last, is filled, and all nodes in the last level are as far left as possible.
For example, if I had the following complete binary tree,
_ 7_
/ \
4 9
/ \ / \
2 6 8 10
/ \ /
1 3 5
the algorithm would return '3' since there are three leaves at the lowest level of the tree.
I've been able to find numerous solutions for finding the count of all the leaves in regular or balanced binary trees, but so far I haven't had any luck with the particular case of finding the count of the leaves at the lowest level of a complete binary tree. Any help would be appreciated.
Do a breadth-first search, so you can aswell find a number of nodes on each level.
Some pseudo code
q <- new queue of (node, level) data
add (root, 0) in q
nodesPerLevel <- new vector of integers
while q is not empty:
(currentNode, currentLevel) <- take from top of q
nodesPerLevel[currentLevel] += 1
for each child in currentNode's children:
add (child, currentLevel + 1) in q
return last value of nodesPerLevel
I was asked this question in an interview and struggled to answer it correctly in the time allotted. Nonetheless, I thought it was an interesting problem, and I hadn't seen it before.
Suppose you have a tree where the root can call (on the phone) each of it's children, when a child receives the call, he can call each of his children, etc. The problem is that each call must be done in a number of rounds, and we need to minimize the number of rounds it takes to make the calls. For example, suppose you have the following tree:
A
/ \
/ \
B D
|
|
C
One solution is for A to call D in round one, A to call B in round two, and B to call C in round three. The optimal solution is for A to call B in round one, and A to call D and B to call C in round two.
Note that A cannot call both B and D in the same round, nor can any node call more than one of its children in the same round. However, multiple nodes with a different parent can call simultaneously. For example, given the tree:
A
/ | \
/ | \
B C D
/\ |
/ \ |
E F G
We can have a sequence (where - separates rounds), such as:
A B - B E, A D - B F, A C, D G
(A calls B first round, B calls E and A calls D second, ...)
I'm assuming some type of dynamic programming can be used, but I'm not sure which direction to take this in. My initial inclination is to use DFS to order the longest path from the root to leaves in decreasing order, but when it comes to the nodes actually making calls, I'm not sure how we can achieve optimality given any tree, not how we can output the paths that the optimal calls would make (i.e. in the first example we could output
A B - B C, A D
I think something like this could get the optimal solution:
suppose the value of 'calls' for each of leaves is 1
for each node get the value of calls for all of his children and rank them according to their 'calls' value
consider rank of each child as 'ranks'
to compute the value of 'calls' for each node loop over his children (after computing their ranks) and find the maximum value of 'calls' + 'ranks'
'calls' value of the root node is the answer
It's sorta dynamic programming on trees and you can implement it recursively like this:
int f(node v)
{
int s = 0;
for each u in v.children
{
d[u] = f(u)
}
sort d and rank its values in r (r for the maximum u would be 1)
for each u in v.children
{
s = max(s, d[u] + r[u] + 1)
}
return s
}
Good Luck!
I am studying data structures and algorithms and this thing is really confusing me
Height of a binary tree, as it is also used in AVL search tree.
According to the book I am following "DATA STRUCTURES by Lipschutz" , it says "the depth (or height) of a tree T is the maximum number of nodes in a branch of T. This turns out to be 1 more than the largest level number of T. The tree 7 in figure 7.1 has depth 5."
figure 7.1 :
A
/ \
/ \
/ \
/ \
B C
/ \ / \
D E G H
/ / \
F J K
/
L
But, on several other resources, height has been calculated differently, though same definition is given. For example as I was reading from internet http://www.cs.utexas.edu/users/djimenez/utsa/cs3343/lecture5.html
" Here is a sample binary tree:
1
/ \
/ \
/ \
/ \
2 3
/ \ / \
/ \ / \
/ \ / \
6 7 4 5
/ \ / /
9 10 11 8
The height of a tree is the maximum of the depths of all the nodes. So the tree above is of height 3. "
Another source http://www.comp.dit.ie/rlawlor/Alg_DS/searching/3.%20%20Binary%20Search%20Tree%20-%20Height.pdf
says, "Height of a Binary Tree
For a tree with just one node, the root node, the height is defined to be 0, if there are 2
levels of nodes the height is 1 and so on. A null tree (no nodes except the null node)
is defined to have a height of –1. "
Now these last 2 explanations comply with each other but not with the example given in the book.
Another source says "There are two conventions to define height of Binary Tree
1) Number of nodes on longest path from root to the deepest node.
2) Number of edges on longest path from root to the deepest node.
In this post, the first convention is followed. For example, height of the below tree is 3.
1
/ \
2 3
/ \
4 5
"
In this, I want to ask how can the number of nodes and edges between root and leaf be the same ?
And what will be the height of a leaf node, according to the book it should be 1 (as the number of largest level is 0, so height should be 0+1=1,
but it is usually said height of leaf node is 0.
Also why does the book mention depth and height as the same thing?
This thing is really really confusing me, I have tried clarifying from a number of sources but cant seem to choose between the two interpretations.
Please help.
==> I would like to add to it since now I am accepting the conventions of the book,
in the topic of AVL search trees where we need to calculate the BALANCE FACTOR (which is the difference of the heights left and right subtrees)
it says :
C (-1)
/ \
(0) A G (1)
/
D (0)
The numbers in the brackets are the balance factors.
Now, If I am to follow the book height of D is 1 and right subtree of G has height (-1) since its empty, so Balance factor of G should be = 1-(-1)=2!
Now why has it taken the height of D to be 0 here ?
PLEASE HELP.
The exact definition of height doesn't matter if what you care about is balance factor. Recall that balance factor is
height(left) - height(right)
so if both are one larger or one smaller than in your favorite definition of height, the balance factor doesn't change, as long as you redefine the height of an empty tree accordingly.
Now the problems is that the "maximum number of nodes in a branch" definition is both recursive but doesn't specify a base case. But since the height of a single-element tree is one according to this definition, the obvious choice for the height of a zero-element tree is zero, and if you work out the formulas you'll find this works.
You can also arrive at the zero value by observing that the base case of the other definition is -1, and otherwise it always gives a value one less than the "max. nodes in a branch" definition.
Consider the following array, which is claimed to have represented a binary tree:
[1, 2, 5, 6, -1, 8, 11]
Given that the index with value -1 indicates the root element, I've below questions:
a) How is this actually represented?
Should we follow below formulae (source from this link) to figure out the tree?
Three simple formulae allow you to go from the index of the parent to the index of its children and vice versa:
* if index(parent) = N, index(left child) = 2*N+1
* if index(parent) = N, index(right child) = 2*N+2
* if index(child) = N, index(parent) = (N-1)/2 (integer division with truncation)
If we use above formulae, then index(root) = 3, index(left child) = 7, which doesn't exist.
b) Is it important to know whether it's a complete binary tree or not?
N=0 must be the root node since by the rules listed, it has no parent. 0 cannot be created from either of the expressions (2*N + 1) or (2*N + 2), assuming no negative N.
Note, index is not the value stored in the array, it is the place in the array.
For [1, 2, 5, 6, -1, 8, 11]
Index 0 = 1
Index 1 = 2
Index 2 = 5, etc.
If it is a complete tree, then -1 is a valid value and tree is
1
/ \
2 5
/ \ / \
6 -1 8 11
-1 could also be a "NULL" pointer, indicating no value exists at that node.
So the Tree would look like
1
/ \
2 5
/ / \
6 8 11
Given an array, you could think of any number of ways how could that array represent a binary tree. So there is no way to know, you have to go to the source of that array (whatever that is).
One of those ways is the way binary heap is usually represented, as per your link. If this was the representation used, -1 would not be the root element. And the node at position 3 would have no children, i.e. it would be a leaf.
And, yeah, it's probably important to know whether it's supposed to be a complete tree or not.
In general, you shouldn't try to figure out what does some data mean like this. You should be given documentation or the source code that uses the data. If you don't have that and you really need to reverse-engineer it, you most likely need to know more about the data. Observing the behavior of the code that uses it should help you. Or decompiling the code.
It may not be a complete binary tree, but it may not be an arbitrary one either. You can represent a tree in which at most a few of the rightmost few leaves are missing (or, if you exchange the convention for left and right children, at most a few of the leftmost leaves missing).
You can't represent this in your array:
A
/ \
B C
/ /
D E
But you can represent this
A
/ \
B C
/ \
D E
or this:
A
/ \
B C
/ \
D E
(for the last, have 2k+1 be the right child and 2k+2 the left child)
You only need to know to number of nodes in the three.
I have multiple binary trees stored as an array. In each slot is either nil (or null; pick your language) or a fixed tuple storing two numbers: the indices of the two "children". No node will have only one child -- it's either none or two.
Think of each slot as a binary node that only stores pointers to its children, and no inherent value.
Take this system of binary trees:
0 1
/ \ / \
2 3 4 5
/ \ / \
6 7 8 9
/ \
10 11
The associated array would be:
0 1 2 3 4 5 6 7 8 9 10 11
[ [2,3] , [4,5] , [6,7] , nil , nil , [8,9] , nil , [10,11] , nil , nil , nil , nil ]
I've already written simple functions to find direct parents of nodes (simply by searching from the front until there is a node that contains the child)
Furthermore, let us say that at relevant times, both all trees are anywhere between a few to a few thousand levels deep.
I'd like to find a function
P(m,n)
to find the lowest common ancestor of m and n -- to put more formally, the LCA is defined as the "lowest", or deepest node in which have m and n as descendants (children, or children of children, etc.). If there is none, a nil would be a valid return.
Some examples, given our given tree:
P( 6,11) # => 2
P( 3,10) # => 0
P( 8, 6) # => nil
P( 2,11) # => 2
The main method I've been able to find is one that uses an Euler trace, which turns the given tree (Adding node A as the invisible parent of 0 and 1, with a "value" of -1), into:
A-0-2-6-2-7-10-7-11-7-2-0-3-0-A-1-4-1-5-8-5-9-5-1-A
And from that, simply find the node between your given m and n that has the lowest number; For example, to find P(6,11), look for a 6 and an 11 on the trace. The number between them that is the lowest is 2, and that's your answer. If A (-1) is in between them, return nil.
-- Calculating P(6,11) --
A-0-2-6-2-7-10-7-11-7-2-0-3-0-A-1-4-1-5-8-5-9-5-1-A
^ ^ ^
| | |
m lowest n
Unfortunately, I do believe that finding the Euler trace of a tree that can be several thousands of levels deep is a bit machine-taxing...and because my tree is constantly being changed throughout the course of the programming, every time I wanted to find the LCA, I'd have to re-calculate the Euler trace and hold it in memory every time.
Is there a more memory efficient way, given the framework I'm using? One that maybe iterates upwards? One way I could think of would be the "count" the generation/depth of both nodes, and climb the lowest node until it matched the depth of the highest, and increment both until they find someone similar.
But that'd involve climbing up from level, say, 3025, back to 0, twice, to count the generation, and using a terribly inefficient climbing-up algorithm in the first place, and then re-climbing back up.
Are there any other better ways?
Clarifications
In the way this system is built, every child will have a number greater than their parents.
This does not guarantee that if n is in generation X, there are no nodes in generation (X-1) that are greater than n. For example:
0
/ \
/ \
/ \
1 2 6
/ \ / \ / \
2 3 9 10 7 8
/ \ / \
4 5 11 12
is a valid tree system.
Also, an artifact of the way the trees are built are that the two immediate children of the same parent will always be consecutively numbered.
Are the nodes in order like in your example where the children have a larger id than the parent? If so, you might be able to do something similar to a merge sort to find them.. for your example, the parent tree of 6 and 11 are:
6 -> 2 -> 0
11 -> 7 -> 2 -> 0
So perhaps the algorithm would be:
left = left_start
right = right_start
while left > 0 and right > 0
if left = right
return left
else if left > right
left = parent(left)
else
right = parent(right)
Which would run as:
left right
---- -----
6 11 (right -> 7)
6 7 (right -> 2)
6 2 (left -> 2)
2 2 (return 2)
Is this correct?
Maybe this will help: Dynamic LCA Queries on Trees.
Abstract:
Richard Cole, Ramesh Hariharan
We show how to maintain a data
structure on trees which allows for
the following operations, all in
worst-case constant time. 1. Insertion
of leaves and internal nodes. 2.
Deletion of leaves. 3. Deletion of
internal nodes with only one child. 4.
Determining the Least Common Ancestor
of any two nodes.
Conference: Symposium on Discrete
Algorithms - SODA 1999
I've solved your problem in Haskell. Assuming you know the roots of the forest, the solution takes time linear in the size of the forest and constant additional memory. You can find the full code at http://pastebin.com/ha4gqU0n.
The solution is recursive, and the main idea is that you can call a function on a subtree which returns one of four results:
The subtree contains neither m nor n.
The subtree contains m but not n.
The subtree contains n but not m.
The subtree contains both m and n, and the index of their least common ancestor is k.
A node without children may contain m, n, or neither, and you simply return the appropriate result.
If a node with index k has two children, you combine the results as follows:
join :: Int -> Result -> Result -> Result
join _ (HasBoth k) _ = HasBoth k
join _ _ (HasBoth k) = HasBoth k
join _ HasNeither r = r
join _ r HasNeither = r
join k HasLeft HasRight = HasBoth k
join k HasRight HasLeft = HasBoth k
After computing this result you have to check the index k of the node itself; if k is equal to m or n, you will "extend" the result of the join operation.
My code uses algebraic data types, but I've been careful to assume you need only the following operations:
Get the index of a node
Find out if a node is empty, and if not, find its two children
Since your question is language-agnostic I hope you'll be able to adapt my solution.
There are various performance tweaks you could put in. For example, if you find a root that has exactly one of the two nodes m and n, you can quit right away, because you know there's no common ancestor. Also, if you look at one subtree and it has the common ancestor, you can ignore the other subtree (that one I get for free using lazy evaluation).
Your question was primarily about how to save memory. If a linear-time solution is too slow, you'll probably need an auxiliary data structure. Space-for-time tradeoffs are the bane of our existence.
I think that you can simply loop backwards through the array, always replacing the higher of the two indices by its parent, until they are either equal or no further parent is found:
(defun lowest-common-ancestor (array node-index-1 node-index-2)
(cond ((or (null node-index-1)
(null node-index-2))
nil)
((= node-index-1 node-index-2)
node-index-1)
((< node-index-1 node-index-2)
(lowest-common-ancestor array
node-index-1
(find-parent array node-index-2)))
(t
(lowest-common-ancestor array
(find-parent array node-index-1)
node-index-2))))