Finding the branch where all leaf nodes are true - algorithm

Consider the following binary tree: (taken from here)
Given that leaf nodes will either be true or false, how can I find the branch (or branches) where all of the leaf nodes are true?
So if only 8, 5, 6 or 7 are true, then the first branch wouldn't match (it would need 9 to be true to match), but the second branch would match as all of its leaves are true.
Even identifying the name for this type of search would help so I can Google it.

You can use a recursive function, diving deeper into the tree and determining bottom-up whether for a certain branch all leaves are true. Those branches can then be stored in some list.
Here's some Python code. Call this function with the tree's root node as the first parameter and an empty list as the second, and the list will be populated with the correct branches.
def allTrue(node, trueList):
    # isLeaf() is assumed to exist; a leaf carries a boolean in node.value
    if isLeaf(node):
        return node.value
    else:
        # evaluate both subtrees unconditionally (see the note below)
        leftTrue = allTrue(node.left, trueList)
        rightTrue = allTrue(node.right, trueList)
        bothTrue = leftTrue and rightTrue
        if bothTrue:
            trueList.append(node)   # every leaf below this node is true
        return bothTrue
One thing to look out for: many programming languages try to be clever, or lazy, by not evaluating the second argument of x and y if x is already false. In this case, however, that would mean not visiting the right branch whenever the left branch is not all-true, missing some all-true branches. That is why the recursive calls are better placed on separate lines, before their results are combined.

You can do a post-order tree traversal and mark each non-leaf node x as true if both children have all leaves of their sub-trees true, and false otherwise (it's an AND operation of the children's labels). This way you recursively mark the whole tree with your desired property.
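Here is a minimal sketch of that marking pass, assuming the same node shape (node.left, node.right, node.value) and isLeaf() helper as the answer above; the all_true attribute is added here purely for illustration:
def mark_all_true(node):
    if isLeaf(node):
        node.all_true = node.value
    else:
        # post-order: label both children first, then AND their labels
        mark_all_true(node.left)
        mark_all_true(node.right)
        node.all_true = node.left.all_true and node.right.all_true
    return node.all_true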

Related

Find number of leaves under each node of a tree

I have a tree which is represented in the following format:
nodes is a list of nodes in the tree in the order of their height from top. Node at height 0 is the first element of nodes. Nodes at height 1 (read from left to right) are the next elements of nodes and so on.
n_children is a list of integers such that n_children[i] = num children of nodes[i]
For example given a tree like {1: {2, 3:{4,5,2}}}, nodes=[1,2,3,4,5,2], n_children = [2,0,3,0,0,0].
Given a Tree, is it possible to generate nodes and n_children and the number of leaves corresponding to each node in nodes by traversing the tree only once?
Is such a representation unique? Or is it possible for two different trees to have the same representation?
For the first question - creating the representation given a tree:
I am assuming by "a given tree" we mean a tree that is given in the form of node-objects, each holding its value and a list of references to its children-node-objects.
I propose this algorithm:
Start at node=root.
if node.children is empty return {values_list:[[node.value]], children_list:[[0]]}
otherwise:
3.1. construct two lists. One will be called values_list and each element there shall be a list of values. The other will be called children_list and each element there shall be a list of integers. Each element in these two lists will represent a level in the sub-tree beginning with node, including node itself (will be added at step 3.3).
So values_list[1] will become the list of values of the children-nodes of node, and values_list[2] will become the list of values of the grandchildren-nodes of node. values_list[1][0] will be the value of the leftmost child-node of node. And values_list[0] will be a list with one element alone, values_list[0][0], which will be the value of node.
3.2. for each child-node of node (for which we have references through node.children):
3.2.1. start over at (2.) with the child-node set to node, and the returned results will be assigned back (when the function returns) to child_values_list and child_children_list accordingly.
3.2.2. for each index i in the lists (they are of the same length): if there is already a list at values_list[i] - concatenate child_values_list[i] to values_list[i] and concatenate child_children_list[i] to children_list[i]. Otherwise assign values_list[i]=child_values_list[i] and children_list[i]=child_children_list[i] (that would be a push - adding to the end of the list).
3.3. Make node.value the sole element of a new list and add that list to the beginning of values_list. Make node.children.length the sole element of a new list and add that list to the beginning of children_list.
3.4. return values_list and children_list
when the above returns with values_list and children_list for node=root (from step (1)), all we need to do is concatenate the elements of the lists (because they are lists, each for one specific level of the tree). After concatenating the list-elements, the resulting values_list_concatenated and children_list_concatenated will be the wanted representation.
In the algorithm above we visit a node only by starting step (2) with it set as node and we do that only once for each child of a node we visit. We start at the root-node and each node has only one parent => every node is visited exactly once.
For the number of leaves associated with each node (if I understand correctly - the number of leaves in the sub-tree of which the node is the root), we can add another list that will be generated and returned: leaves_list.
In the stop-case (no children to node - step (2)) we will return leaves_list:[[1]]. In step (3.2.2) we will concatenate the list-elements like the other two lists' list-elements. And in step (3.3) we will sum the first list-element leaves_list[0] and make that sum the sole element in a new list that we add to the beginning of leaves_list (something like leaves_list.add_to_beginning([leaves_list[0].sum()])).
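A minimal Python sketch of the algorithm above, assuming node objects with .value and .children (a list of child node objects); the function and variable names are illustrative:
def represent(node):
    # Step 2: a leaf contributes one single-element level to each list.
    if not node.children:
        return [[node.value]], [[0]], [[1]]
    # Step 3.1: per-level lists for the sub-tree rooted at node.
    values_list, children_list, leaves_list = [], [], []
    for child in node.children:                              # step 3.2
        c_values, c_children, c_leaves = represent(child)    # step 3.2.1
        for i in range(len(c_values)):                       # step 3.2.2
            if i < len(values_list):
                values_list[i] += c_values[i]
                children_list[i] += c_children[i]
                leaves_list[i] += c_leaves[i]
            else:
                values_list.append(c_values[i])
                children_list.append(c_children[i])
                leaves_list.append(c_leaves[i])
    # Step 3.3: prepend this node's own level.
    values_list.insert(0, [node.value])
    children_list.insert(0, [len(node.children)])
    leaves_list.insert(0, [sum(leaves_list[0])])             # leaves of node = sum over its children
    return values_list, children_list, leaves_list           # step 3.4

def to_representation(root):
    # The concatenation step: flatten the per-level lists.
    values_list, children_list, leaves_list = represent(root)
    nodes = [v for level in values_list for v in level]
    n_children = [c for level in children_list for c in level]
    n_leaves = [c for level in leaves_list for c in level]
    return nodes, n_children, n_leaves
For the example tree {1: {2, 3:{4,5,2}}} this should yield nodes=[1,2,3,4,5,2], n_children=[2,0,3,0,0,0] and n_leaves=[4,1,3,1,1,1].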
For the second question - is this representation unique:
To prove uniqueness we actually want to show that the function (let's call it rep for "representation") preserves distinctiveness over the space of trees, i.e. that it is an injection. As you can see in the wiki linked, for that it suffices to show that there exists a function (let's call it tre for "tree") that, given a representation, gives a tree back, and that for every tree t it holds that tre(rep(t))=t. In simple words - that we can make a method that takes a representation and builds a tree out of it, and that for every tree, if we make its representation and pass that representation through that method, we'll get the exact same tree back.
So let's get cracking!
Actually the first job - creating that method (the function tre) is already done by you - by the way you explained what the representation is. But let's make it explicit:
if the lists are empty return the empty tree. Otherwise continue
make the root node with values[0] as its value and n_children[0] as its number of children (without making the children nodes yet).
initialize a list-index i=1, a level index li=1, a level-elements index lei=root.children.length and a next-level-elements accumulator nle_acc=0
while lei>0:
4.1. for lei times:
4.1.1. make a node with values[i] as value and n_children[i] as the number of children.
4.1.2. add the new node as the leftmost child in level li that has not been filled yet (traverse level li of the tree from left to right and assign the new node to the first reference that is not assigned yet. We know the previous level is done, so each node in level li-1 has a children.length property we can check to see whether it has filled the number of children it should have)
4.1.3. add nle_acc+=n_children[i]
4.1.4. increment ++i
4.2. assign lei=nle_acc (level-elements can take what the accumulator gathered for it)
4.3. clear nle_acc=0 (next-level-elements accumulator needs to accumulate from the start for the next round)
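A minimal sketch of this rebuilding loop, assuming a simple Node class; instead of re-traversing the partially built tree in step 4.1.2, it keeps a queue of nodes whose child slots are not yet filled, which is equivalent for a level-by-level, left-to-right fill:
from collections import deque

class Node:
    def __init__(self, value, n_children):
        self.value = value
        self.children = [None] * n_children   # empty slots, filled left to right

def tre(nodes, n_children):
    if not nodes:
        return None                           # step 1: the empty tree
    root = Node(nodes[0], n_children[0])      # step 2
    pending = deque([root])                   # nodes that may still need children
    for value, k in zip(nodes[1:], n_children[1:]):   # remaining nodes in level order
        while None not in pending[0].children:        # skip parents that are already full
            pending.popleft()
        parent = pending[0]
        child = Node(value, k)
        parent.children[parent.children.index(None)] = child   # leftmost empty slot
        pending.append(child)
    return root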
Now we need to prove that an arbitrary tree that is passed through the first algorithm and then through the second algorithm (this one here) will get out of all of that the same as it was originally.
As I'm not trying to prove the correctness of the algorithms (although I should), let's assume they do what I intended them to do, i.e. the first one writes the representation as you described it, and the second one makes a tree level-by-level, left-to-right, assigning a value and the number of children from the representation and filling the children references according to those numbers when it comes to the next level.
So each node has the right amount of children according to the representation (that's how the children were filled), and that number was written from the tree (when generating the representation). And the same is true for the values and thus it is the same tree as the original.
The proof actually should be much more elaborate and detailed - but I think I'll leave it at that for now. If there is demand for elaboration, maybe I'll make it an actual proof.

How to check if a binary tree isComplete() in a data structure?

How to check if a binary tree isComplete() in a data structure?
The leaf nodes have to be filled from left to right without gaps.
You could have a recursive method that queries the left and right child to see if they have two children each and, if they do, returns true. Then you just call that method, pass it your root, and it'll return either true or false after recursing through each of its children and their children.
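The "filled from left to right without gaps" requirement is easy to check with a level-order (breadth-first) walk rather than per-node child counts alone: once a missing child is seen, no later node in level order may have any children. A minimal sketch, assuming nodes with .left and .right attributes:
from collections import deque

def is_complete(root):
    if root is None:
        return True                     # treat the empty tree as complete
    queue = deque([root])
    seen_gap = False                    # becomes True at the first missing child
    while queue:
        node = queue.popleft()
        for child in (node.left, node.right):
            if child is None:
                seen_gap = True
            elif seen_gap:
                return False            # a child appears after a gap: not complete
            else:
                queue.append(child)
    return True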

Is null a binary tree?

So I have seen a few examples such as
How to validate a Binary Search Tree?
http://www.geeksforgeeks.org/check-if-a-binary-tree-is-subtree-of-another-binary-tree/
They return 1, or true, if a tree is null.
Expanding the question a bit - assuming I had to find whether TreeSmall is a subtree of TreeBig, and my TreeSmall is null, should the return value of checkSubtree(smallTree) be true or false? A true indicates TreeSmall was a tree with value of null. This does not make sense to me.
In pure computer science, null is a valid binary tree. It is called an empty binary tree. Just like an empty set is still a valid set. Furthermore, a binary tree with only a single root node and no children is also valid (but not empty). See this Stack Overflow answer for more information.
In practical implementation, there are two ways to go about it though.
Assume that a valid binary tree must have at least one node and do not allow empty trees. Each node does not have to have children. All recursive methods on this tree do not descend to the level of null. Rather, they stop when they see that the left child or right child of a node is null. This implementation works as long as you don't pass null to any place where a tree is expected.
Assume that null is a valid binary tree (formally, just the empty tree). In this implementation, you first check if the pointer is null before doing any operations on it (like checking for left/right children, etc.) This implementation works for any pointer to a tree. You can freely pass null pointers to methods that are expecting a tree.
Both ways work. The second implementation has the advantage of flexibility. You can pass null to anything that expects a tree and it will not raise an exception. The first implementation has the advantage of not wasting time descending to "child nodes" that are null and you don't have to use null checks at the beginning of every function/method which operates on a node. You simply have to do null checks for the children instead.
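To make the two conventions concrete, here is a minimal sketch of the same node-counting routine written both ways (the Node class is just for illustration):
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

# Convention 1: a tree has at least one node; recursion stops before null children.
def count_nodes_v1(node):
    total = 1
    if node.left is not None:
        total += count_nodes_v1(node.left)
    if node.right is not None:
        total += count_nodes_v1(node.right)
    return total

# Convention 2: null is the empty tree; the null check is the base case.
def count_nodes_v2(node):
    if node is None:
        return 0
    return 1 + count_nodes_v2(node.left) + count_nodes_v2(node.right)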
That depends on the application and is a question of definition.
Edit:
For example Wikipedia "defines" a BST as follows:
In computer science, a binary search tree (BST), sometimes also called an ordered or sorted binary tree, is a node-based binary tree data structure which has the following properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
The left and right subtree must each also be a binary search tree.
There must be no duplicate nodes
Let's test those for null:
left subtree doesn't exist, so there are no nodes violating this rule -> check
similar to the first -> check
if all these tests are passed, this passes too, since both subtrees are null -> check
of course there aren't duplicates -> check
So by this definition null is a valid BST. You could invert this by also requiring "there must be one root node", which doesn't affect any of the practical properties of a BST but might in an explicit application.
Ex falso sequitur quodlibet - since "null" is nothing at all, it can be interpreted to be anything. It is really a matter of design. Some people may claim that checkSubTree() should throw something like an IllegalArgumentException in this case. Another approach would be to introduce a special kind of typed object or instance which represents an empty tree (cf. NullObjectPattern). Such a null-object would be a tree by all accounts, e.g. EmptyTree instanceof Tree, while null instanceof Tree would always be false.
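A minimal sketch of that null-object idea in Python (the class names are illustrative); it mirrors the instanceof point above, since an EmptyTree instance is a Tree while None is not:
class Tree:
    pass

class EmptyTree(Tree):
    def is_empty(self):
        return True

class TreeNode(Tree):
    def __init__(self, value, left=None, right=None):
        self.value = value
        # missing children become EmptyTree objects instead of None
        self.left = left if left is not None else EmptyTree()
        self.right = right if right is not None else EmptyTree()
    def is_empty(self):
        return False

# An empty tree is a Tree by all accounts, unlike None:
assert isinstance(EmptyTree(), Tree)
assert not isinstance(None, Tree)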

Applying a Logarithm to Navigate a Tree

I had once known of a way to use logarithms to move from one leaf of a tree to the next "in-order" leaf of a tree. I think it involved taking a position value (rank?) of the "current" leaf and using it as a seed for a fresh traversal from the root down to the new target leaf - all the way using a log function test to determine whether to follow the right or left node down to the leaf.
I no longer recall how to exercise that technique. Can anyone re-introduce me?
I also don't recall if the technique required the tree to be balanced, or if it worked on n-trees or only binary trees. Any info would be appreciated.
Since you mentioned whether to go left or right, I'm going to assume you're talking about a binary tree specifically. In that case, I think you're right that there is a way. If your nodes are numbered left-to-right, top-to-bottom, starting with 1, then you can find the rank (depth in the tree) by taking the integer part of log2 of the node's number. To find that node again from the root, you can use the binary representation of the number, where 0 = left and 1 = right.
For example:
n = 11
11 in binary is 1011
We always ignore the first 1 since it's going to be there for every number (all nodes of rank n will be binary numbers with n+1 digits, with the first digit being 1). We're left with 011, which is saying from the root go left, then right, then right.
If you want to find the next in-order leaf, take the current leaf's number and add one, then traverse from the root using this method.
I believe this only works with balanced binary trees.
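A minimal sketch of that numbering trick, assuming nodes are numbered 1, 2, 3, ... top to bottom, left to right:
def path_to(n):
    """Turns to take from the root to node number n; the depth of n is
    floor(log2(n)), i.e. n.bit_length() - 1."""
    bits = bin(n)[3:]                   # binary digits after the leading 1
    return ["left" if b == "0" else "right" for b in bits]

# path_to(11) -> ['left', 'right', 'right']   (11 is 0b1011)
# path_to(12) -> path to the next node in level order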
OK, this proposal requires more characters than I can fit into a comment box. Steven does not believe that knowing the depth of the node in the tree is useful. I think it is. I have been wrong in the past, and I'm sure I'll be wrong in the future, so I will try to explain how this idea works in an attempt to not be wrong in the present. If I am, I apologize ahead of time. I'm nearly certain I got it from one of my Algorithms and Datastructures courses, using the CLR book. Please excuse any slips in notation or nomenclature, I haven't studied this stuff in a while.
Quoting wikipedia, "a complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible."
We are considering a complete tree with any branching degree (where a binary tree has a branching degree of two). Also, we are considering our nodes to have a 'positional value', which is the node's position in a top-to-bottom, left-to-right ordering of the nodes.
Now, if we are given a positional value, we can find the node in the following fashion. Take the log_base_n of the positional value of the element we are looking for (floor of this, we want an integer). Traverse down from the root that many times, minus one. Now, start looking through all the children of the nodes at this level. Your node you are searching for will be in this set.
This is an attempt at explaining the additional part of the wikipedia definition:
"This depth is equal to the integer part of log2(n) where n
is the number of nodes on the balanced tree.
Example 1: balanced tree with 1 node, log2(1) = 0 (depth = 0).
Example 2: balanced tree with 3 nodes, log2(3) = 1.59 (depth=1).
Example 3: balanced tree with 5 nodes, log2(5) = 2.32
(depth of tree is 2 nodes)."
This is useful, because you can simply traverse down to this level and then start looking around. It is useful and important to know the depth your node is located at, so you can start looking there instead of at the beginning. Unless you know which level of the tree you are on, you have to start looking at all the nodes sequentially.
That is why I think it is helpful to know the depth of the node we are searching for.
It is a little bit odd, since having the "positional value" is not something we normally care about in a tree. I can see why Steve thought of this in terms of an array, since positional value is inherent in arrays.
-Brian J. Stinar-
Something that at least resembles your description is the Binary Heap, used among other things in Priority Queues.
I think I've found the answer, or at least a facsimile.
Assume the tree nodes are numbered, starting at 1, top-down and left-to-right. Assume traversal begins at the root, and halts when it finds node X (which means the parent is linked to its children). Also, for quick reference, the base 2 logarithmic values for nodes 1 through 12 are:
log2(1) = 0.0
log2(2) = 1
log2(3) = 1.58
log2(4) = 2
log2(5) = 2.32
log2(6) = 2.58
log2(7) = 2.807
log2(8) = 3
log2(9) = 3.16
log2(10) = 3.32
log2(11) = 3.459
log2(12) = 3.58
The fractional portion represents a unique diagonal position (notice how nodes 3, 6, and 12 all have fractional portion 0.58). Also notice that every node belongs either to the left or right side of the tree, depending on whether the log's fractional component is less than or greater than 0.5. Anecdotes aside, the algorithm for finding a node is then as follows:
examine fractional portion, if it is less than .5, turn left. Else turn right.
subtract one from the whole number portion of the log, stop if the value reaches zero.
double the fractional portion, and start over.
So, for example, if node 11 is what you seek, then you start by computing the log, which is 3.459. Then...
whole 3, fraction .459 <= fraction less than .5: turn left and decrement the whole number to 2.
whole 2, fraction .918 <= doubled fraction more than .5: turn right and decrement the whole number to 1.
whole 1, fraction .836 <= doubling .918 gives 1.836, but only the fractional part counts: turn right and decrement the whole number to 0. Done!
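The same trace as a minimal Python sketch (floating-point precision will limit how deep this can go):
import math

def log_path_to(n):
    """Follow the fractional part of log2(n) down from the root to node n."""
    whole, frac = divmod(math.log2(n), 1)
    path = []
    for _ in range(int(whole)):         # one turn per level below the root
        path.append("left" if frac < 0.5 else "right")
        frac = (frac * 2) % 1           # double, keeping only the fractional part
    return path

# log_path_to(11) -> ['left', 'right', 'right'], matching the trace above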
With appropriate accommodations, the same technique appears to work for any balanced n-ary tree. For example, given a balanced ternary tree, the choice of following the left, middle, or right edge is again based on the fractional portion of the log, as follows:
between 0.5-0.832: turn left (a one-third fraction range)
between 0.17-0.49: turn right (another one-third fraction range)
otherwise go down the middle. (the last one-third range)
The algorithm is adjusted by multiplying the fractional portion by 3 instead of 2. Again, a quick reference for those who want to test this last statement:
log3(1) = 0.0
log3(2) = 0.63
log3(3) = 1
log3(4) = 1.26
log3(5) = 1.46
log3(6) = 1.63
log3(7) = 1.77
log3(8) = 1.89
log3(9) = 2
At this point I wonder if there is an even more concise way to express this whole "log-based top-down selection of a node." I'm interested if anyone knows...
Case 1: Nodes have pointers to their parent
Starting from the node, traverse up the parent pointer until one with non-null right_child is found. Go to the right_child and traverse left_child as long as they are non-null.
Case 2: Nodes do not have pointers to the parent
Starting from the root, find the path to the node (including the root and the node). Then find the latest vertex (i.e. a node) in the path that has a non-null right_child. Go to the right_child and traverse left_child as long as they are non-null.
In both cases, we traverse either up or down between the root and one of the nodes. Such a traversal is at most on the order of the depth of the tree, hence logarithmic in the number of nodes if the tree is balanced.
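For Case 1, a minimal sketch, assuming nodes carry parent, left_child and right_child attributes; note it also keeps climbing past ancestors that were reached from their right_child (or that have no right_child at all), which the description above glosses over:
def next_inorder_leaf(leaf):
    # Climb until we can step into a right sub-tree we have not visited yet.
    node, parent = leaf, leaf.parent
    while parent is not None and (parent.right_child is node or parent.right_child is None):
        node, parent = parent, parent.parent
    if parent is None:
        return None                      # leaf was the last leaf in order
    node = parent.right_child            # step into the unvisited right sub-tree
    # Descend to its first leaf in in-order: prefer left_child, else right_child.
    while node.left_child is not None or node.right_child is not None:
        node = node.left_child if node.left_child is not None else node.right_child
    return node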

Shortest branch in a binary tree?

A binary tree can be encoded using two functions l and r such that for a node n, l(n) gives the left child of n and r(n) gives the right child of n.
A branch of a tree is a path from the root to a leaf; the length of a branch to a particular leaf is the number of arcs on the path from the root to that leaf.
Let MinBranch(l, r, x) be a simple recursive algorithm that takes a binary tree encoded by the l and r functions together with the root node x of the binary tree and returns the length of the shortest branch of the binary tree.
Give the pseudocode for this algorithm.
OK, so basically this is what I've come up with so far:
MinBranch(l, r, x)
{
    if x is None return 0
    left_one = MinBranch(l, r, l(x))
    right_one = MinBranch(l, r, r(x))
    return {min (left_one),(right_one)}
}
Obviously this isn't great or perfect. I'd be grateful if people can help me get this perfect and working - any help will be appreciated.
I doubt anyone will solve homework for you straight-up. A clue: the return value must surely grow higher as the tree gets bigger, right? However I don't see any numeric literals in your function except 0, and no addition operators either. How will you ever return larger numbers?
Another angle on the same issue: any time you write a recursive function, it helps to enumerate "what are all the conditions where I should stop calling myself? What do I return in each circumstance?"
You're on the right approach, but you're not quite there; your recursive algorithm will always return 0. (the logic is almost right, though...)
note that the length of the sub-branches is one less than the length of the branch; so left_one and right_one should be 1 + MinBranch....
Stepping through the algorithm with some sample trees will help uncover off-by-one errors like this one...
It looks like you almost have it, but consider this example:
  4
 / \
3   5
When you trace through MinBranch, you'll see that in your MinBranch(l, r, 4) call:
left_one = MinBranch(l, r, l(x))
         = MinBranch(l, r, l(4))
         = MinBranch(l, r, 3)
         = 0
That makes sense, after all, 3 is a leaf node, so of course the distance
to the closest leaf node is 0. The same happens for right_one.
But you then wind up here:
return {min (left_one),(right_one)}
     = {min (0), (0) }
     = 0
but that's clearly wrong, because this node (4) is not a leaf node. Your
code forgot to count the current node (oops!). I'm sure you can manage
to fix that.
Now, actually, the way you're doing this isn't the fastest, but I'm not sure if that's relevant for this exercise. Consider this tree:
      4
     / \
    3   5
   /
  2
 /
1
Your algorithm will count up the left branch recursively, even though it could, hypothetically, bail out if you first counted the right branch and noted that 3 has a left child, so it's clearly longer than 5 (which is a leaf). But, of course, counting the right branch first doesn't always work!
Instead, with more complicated code, and probably a tradeoff of greater
memory usage, you can check nodes left-to-right, top-to-bottom (just
like English reading order) and stop at the first leaf you find.
What you've created can be thought of as a depth-first search. However, given what you're after (the shortest branch), this may not be the most efficient approach. Think about how your algorithm would perform on a tree that was very heavy on the left side (of the root node), but had only one node on the right side.
Hint: consider a breadth-first search approach.
What you have there looks like a depth-first search algorithm, which will have to search the entire tree before you come up with a solution. What you need is a breadth-first search algorithm, which can return as soon as it finds the solution without doing a complete search.
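For reference, a minimal breadth-first sketch under the same l/r/x encoding, assuming l and r return None for a missing child; the first leaf the level-order walk reaches ends a shortest branch:
from collections import deque

def min_branch_bfs(l, r, x):
    if x is None:
        return 0                          # same convention as the pseudocode above
    queue = deque([(x, 0)])               # (node, number of arcs from the root)
    while queue:
        node, depth = queue.popleft()
        if l(node) is None and r(node) is None:
            return depth                  # first leaf found in level order
        for child in (l(node), r(node)):
            if child is not None:
                queue.append((child, depth + 1))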
