How do I check whether a binary tree is complete (an isComplete() check) in a data structure?
The leaf nodes have to be filled from left to right, without gaps.
You could have a recursive method that queries the left and right child to see if they each have two children and, if they do, returns true. Then you just call that method with your root and it will return either true or false after recursing through each of its children and their children.
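For comparison, here is a minimal sketch of a different, commonly used check: a level-order (BFS) scan that follows the "filled from left to right without gaps" definition directly. It assumes nodes with .left and .right attributes; the name is_complete is illustrative.

from collections import deque

def is_complete(root):
    """Level-order (BFS) check: once a missing child is seen, no further node
    may appear in the traversal. Assumes nodes with .left and .right fields."""
    if root is None:
        return True
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False         # a node after a gap: leaves not packed left to right
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True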
I have a tree which is represented in the following format:
nodes is a list of the nodes in the tree in order of their height from the top. The node at height 0 is the first element of nodes; nodes at height 1 (read from left to right) are the next elements of nodes, and so on.
n_children is a list of integers such that n_children[i] = num children of nodes[i]
For example, given a tree like {1: {2, 3: {4,5,2}}}, nodes = [1,2,3,4,5,2] and n_children = [2,0,3,0,0,0].
Given a Tree, is it possible to generate nodes and n_children and the number of leaves corresponding to each node in nodes by traversing the tree only once?
Is such a representation unique? Or is it possible for two different trees to have the same representation?
For the first question - creating the representation given a tree:
I am assuming by "a given tree" we mean a tree that is given in the form of node-objects, each holding its value and a list of references to its children-node-objects.
I propose this algorithm:
1. Start at node=root.
2. If node.children is empty, return {values_list: [[node.value]], children_list: [[0]]}.
3. Otherwise:
3.1. Construct two lists. One will be called values_list and each element in it shall be a list of values. The other will be called children_list and each element in it shall be a list of integers. Each element in these two lists will represent a level in the sub-tree beginning with node, including node itself (added at step 3.3).
So values_list[1] will become the list of values of the child-nodes of node, and values_list[2] will become the list of values of the grandchild-nodes of node. values_list[1][0] will be the value of the leftmost child-node of node. And values_list[0] will be a list with a single element, values_list[0][0], which will be the value of node itself.
3.2. For each child-node of node (for which we have references through node.children):
3.2.1. Start over at step (2) with the child-node set as node; when the call returns, assign the results back to child_values_list and child_children_list accordingly.
3.2.2. For each index i in the lists (they are of the same length): if there is already a list at values_list[i], concatenate child_values_list[i] onto values_list[i] and concatenate child_children_list[i] onto children_list[i]. Otherwise assign values_list[i] = child_values_list[i] and children_list[i] = child_children_list[i] (that is a push - adding to the end of the list).
3.3. Make node.value the sole element of a new list and add that list to the beginning of values_list. Make node.children.length the sole element of a new list and add that list to the beginning of children_list.
3.4. Return values_list and children_list.
when the above returns with values_list and children_list for node=root (from step (1)), all we need to do is concatenate the elements of the lists (because they are lists, each for one specific level of the tree). After concatenating the list-elements, the resulting values_list_concatenated and children_list_concatenated will be the wanted representation.
In the algorithm above we visit a node only by starting step (2) with it set as node and we do that only once for each child of a node we visit. We start at the root-node and each node has only one parent => every node is visited exactly once.
For the number of leaves associated with each node (if I understand correctly: the number of leaves in the sub-tree of which the node is the root), we can add another list that will be generated and returned: leaves_list.
In the stop-case (no children for node - step (2)) we return leaves_list: [[1]]. In step (3.2.2) we concatenate its list-elements just like the other two lists' list-elements. And in step (3.3) we sum the first list-element leaves_list[0] and make that sum the sole element in a new list that we add to the beginning of leaves_list (something like leaves_list.add_to_beginning([leaves_list[0].sum()])). A code sketch of the whole procedure follows below.
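For concreteness, here is a minimal Python sketch of the algorithm above, including the leaves_list extension. It assumes node objects with .value and an ordered .children list; the function name represent is mine, not part of the original description.

def represent(node):
    """Recursive sketch of the algorithm above. Returns per-level lists of
    values, child counts and leaf counts; assumes node objects with .value
    and an ordered .children list."""
    if not node.children:                       # step 2: a leaf
        return [[node.value]], [[0]], [[1]]
    values_list, children_list, leaves_list = [], [], []
    for child in node.children:                 # step 3.2
        cv, cc, cl = represent(child)
        for i in range(len(cv)):                # step 3.2.2: merge level i
            if i < len(values_list):
                values_list[i] += cv[i]
                children_list[i] += cc[i]
                leaves_list[i] += cl[i]
            else:
                values_list.append(cv[i])
                children_list.append(cc[i])
                leaves_list.append(cl[i])
    # step 3.3: prepend this node's own level (its leaf count is the sum of
    # its children's leaf counts, which currently sit in leaves_list[0]).
    values_list.insert(0, [node.value])
    children_list.insert(0, [len(node.children)])
    leaves_list.insert(0, [sum(leaves_list[0])])
    return values_list, children_list, leaves_list

# Concatenating the levels yields the wanted flat representation:
# values_list, children_list, leaves_list = represent(root)
# nodes      = [v for level in values_list   for v in level]
# n_children = [c for level in children_list for c in level]
# n_leaves   = [l for level in leaves_list   for l in level]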
For the second question - is this representation unique:
To prove uniqueness we actually want to show that the function (let's call it rep, for "representation") preserves distinctness over the space of trees, i.e. that it is an injection. As you can see in the wiki linked, it suffices to show that there exists a function (let's call it tre, for "tree") that, given a representation, gives a tree back, and that for every tree t it holds that tre(rep(t)) = t. In simple words: we can write a method that takes a representation and builds a tree out of it, and for every tree, if we build its representation and pass that representation through that method, we get the exact same tree back.
So let's get cracking!
Actually the first job - creating that method (the function tre) is already done by you - by the way you explained what the representation is. But let's make it explicit:
1. If the lists are empty, return the empty tree. Otherwise continue.
2. Make the root node with values[0] as its value and n_children[0] as its number of children (without making the child nodes yet).
3. Initiate a list index i=1, a level index li=1, a level-elements counter lei=root.children.length and a next-level-elements accumulator nle_acc=0.
4. While lei>0:
4.1. for lei times:
4.1.1. make a node with values[i] as its value and n_children[i] as its number of children.
4.1.2. add the new node as the leftmost child in level li that has not been filled yet (traverse the tree down to level li from the leftmost node rightwards and assign the new node to the first reference that is not assigned yet; we know the previous level is done, so each node in level li-1 has a children.length property we can check to see whether it already has the number of children it should have).
4.1.3. add nle_acc += n_children[i]
4.1.4. increment ++i
4.2. assign lei = nle_acc (the level-elements counter takes what the accumulator gathered for it)
4.3. clear nle_acc = 0 (the next-level-elements accumulator needs to accumulate from scratch for the next round)
4.4. increment ++li (move on to the next level)
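And a minimal Python sketch of this reconstruction. The names rebuild and Node are mine; it uses a FIFO queue of nodes still awaiting children instead of the explicit li/lei/nle_acc bookkeeping, which yields the same level-by-level, left-to-right fill.

from collections import deque

class Node:
    def __init__(self, value, n_children):
        self.value = value
        self.children = [None] * n_children     # references filled in later

def rebuild(nodes, n_children):
    """Sketch of the reconstruction above: build the tree level by level,
    left to right, from the flat representation."""
    if not nodes:
        return None                             # step 1: empty representation
    root = Node(nodes[0], n_children[0])        # step 2
    pending = deque([root])                     # nodes whose child slots are unfilled
    i = 1
    while pending and i < len(nodes):           # step 4
        parent = pending.popleft()
        for slot in range(len(parent.children)):
            child = Node(nodes[i], n_children[i])
            parent.children[slot] = child       # leftmost unfilled slot (step 4.1.2)
            pending.append(child)
            i += 1
    return root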
Now we need to prove that an arbitrary tree that is passed through the first algorithm and then through the second algorithm (this one here) comes out of all of that the same as it was originally.
As I'm not trying to prove the correctness of the algorithms (although I should), let's assume they do what I intended them to do, i.e. the first one writes the representation as you described it, and the second one makes a tree level-by-level, left-to-right, assigning a value and the number of children from the representation and filling the children references according to those numbers when it comes to the next level.
So each node has the right amount of children according to the representation (that's how the children were filled), and that number was written from the tree (when generating the representation). And the same is true for the values and thus it is the same tree as the original.
The proof actually should be much more elaborate and detailed - but I think I'll leave it at that for now. If there is demand for elaboration, maybe I'll make it an actual proof.
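If you want to sanity-check the round trip tre(rep(t)) = t in code, a small structural-equality helper over the two sketches above (represent and rebuild, names I introduced) could look like this:

def same_tree(a, b):
    """Structural equality of two node-object trees (values and child order)."""
    if a is None or b is None:
        return a is b
    return (a.value == b.value
            and len(a.children) == len(b.children)
            and all(same_tree(x, y) for x, y in zip(a.children, b.children)))

# Round trip, using the represent() and rebuild() sketches above:
# values_list, children_list, _ = represent(root)
# nodes      = [v for level in values_list   for v in level]
# n_children = [c for level in children_list for c in level]
# assert same_tree(root, rebuild(nodes, n_children))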
I have an unordered tree in the form of, for example:
Root
    A1
        A1_1
            A1_1_1
            A1_1_2
                A1_1_2_1
                A1_1_2_2
                A1_1_2_3
            A1_1_3
            A1_1_n
        A1_2
        A1_3
        A1_n
    A2
        A2_1
        A2_2
        A2_3
        A2_n
The tree is unordered.
Each node can have an arbitrary number N of children.
Each node stores a unique long value.
The value required can be at any position.
My problem: if I need the long value of A1_1_2_3, the first time I traverse the nodes I do a depth-first search to get it. However, on later calls for the same node I must get its value without a recursive search. Why? If this tree had hundreds of thousands of nodes before reaching my A1_1_2_3 node, it would take too much time.
What I thought of is to leave some pointers behind after the first traversal. E.g. in my case, when I give back the long value for A1_1_2_3, I also give back an array with information for future searches of the same node, saying: to get to A1_1_2_3, I need:
first child of Root, which is A1
first child of A1, which is A1_1
second child of A1_1, which is A1_1_2
third child of A1_1_2, which is what I need: A1_1_2_3
So I figured I would store this information along with the value of A1_1_2_3 as an array of indexes: [0, 0, 1, 2]. By doing so, I could quickly reach the node again on subsequent calls for A1_1_2_3 and avoid the recursion each time.
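A minimal sketch of what following such a stored index path could look like, assuming each node exposes an ordered children list (the helper name follow_path is mine):

def follow_path(root, index_path):
    """Follow a stored child-index path such as [0, 0, 1, 2]; returns the node,
    or None if the structure changed and the path no longer fits."""
    node = root
    for idx in index_path:
        if idx >= len(node.children):
            return None          # path broken: caller falls back to a search
        node = node.children[idx]
    return node                  # caller should still verify this is the expected node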
However, the nodes can change. On subsequent calls I might be given a new structure, so my indexes stored earlier would not match anymore. But if this happens, I thought that whenever I don't find the element anymore, I would recursively go back up a level and search for the item, and so on, until I find it again and store the indexes again for future reference:
e.g. if my A1_1_2_3 is now situated in this new structure:
A1_1
    A1_1_0
    A1_1_1
    A1_1_2
        A1_1_2_1
        A1_1_2_2
        A1_1_21_22
        A1_1_2_3
... in this case the new element A1_1_0 ruined my stored structure, so I would go back up a level and search children again recursively until I find it again.
Does this even make sense, what I thought of here, or am I overcomplicating things? I'm talking about an unordered tree which can have at most about three hundred thousand nodes, and it is vital that I can jump to nodes as fast as possible. But the tree can also be very small, under 10 nodes.
Is there a more efficient way to search in such a situation?
Thank you for any idea.
edit:
I forgot to add: what I need on subsequent calls is not just the same value; its position is also important, because I must get the next page of children after that child (since it's a tree structure, I'm paging through nodes after the initially selected one). I hope it makes more sense now.
So I have seen a few examples such as
How to validate a Binary Search Tree?
http://www.geeksforgeeks.org/check-if-a-binary-tree-is-subtree-of-another-binary-tree/
They return 1, or true, if a tree is null.
Expanding the question a bit: assuming I had to find whether TreeSmall is a subtree of TreeBig, and my TreeSmall is null, should the return value of checkSubtree(smallTree) be true or false? A true would indicate that TreeSmall was a tree with a value of null. This does not make sense to me.
In pure computer science, null is a valid binary tree. It is called an empty binary tree. Just like an empty set is still a valid set. Furthermore, a binary tree with only a single root node and no children is also valid (but not empty). See this Stack Overflow answer for more information.
In practical implementation, there are two ways to go about it though.
Assume that a valid binary tree must have at least one node and do not allow empty trees. Each node does not have to have children. All recursive methods on this tree do not descend to the level of null. Rather, they stop when they see that the left child or right child of a node is null. This implementation works as long as you don't pass null to any place where a tree is expected.
Assume that null is a valid binary tree (formally, just the empty tree). In this implementation, you first check if the pointer is null before doing any operations on it (like checking for left/right children, etc.) This implementation works for any pointer to a tree. You can freely pass null pointers to methods that are expecting a tree.
Both ways work. The second implementation has the advantage of flexibility. You can pass null to anything that expects a tree and it will not raise an exception. The first implementation has the advantage of not wasting time descending to "child nodes" that are null and you don't have to use null checks at the beginning of every function/method which operates on a node. You simply have to do null checks for the children instead.
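To illustrate the difference, here are two small Python sketches of the same node-counting operation, one per convention; the .left/.right field names are assumptions:

# Convention 1: an empty tree is not allowed; never recurse into None children.
def count_nodes_v1(node):
    total = 1                                  # node itself is guaranteed non-null
    if node.left is not None:
        total += count_nodes_v1(node.left)
    if node.right is not None:
        total += count_nodes_v1(node.right)
    return total

# Convention 2: None is the (valid) empty tree; the null check is the base case.
def count_nodes_v2(node):
    if node is None:
        return 0
    return 1 + count_nodes_v2(node.left) + count_nodes_v2(node.right)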
That depends on the application and is a question of definition.
Edit:
For example Wikipedia "defines" a BST as follows:
In computer science, a binary search tree (BST), sometimes also called an ordered or sorted binary tree, is a node-based binary tree data structure which has the following properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
The left and right subtree must each also be a binary search tree.
There must be no duplicate nodes
Let's test those for null:
the left subtree doesn't exist, so there are no nodes violating this rule -> check
similar to the first -> check
if all these tests pass, this one passes too, since both subtrees are null -> check
of course there aren't duplicates -> check
So by this definition null is a valid BST. You could invert this by also requiring "there must be one root node", which doesn't affect any of the practical properties of a BST but might matter in a specific application.
Ex falso sequitur quodlibet - since "null" is nothing at all, it can be interpreted to be anything. It is really a matter of design. Some people may claim that checkSubTree() should throw something like an IllegalArgumentException in this case. Another approach would be to introduce a special kind of typed object or instance which represents an empty tree (cf. NullObjectPattern). Such a null-object would be a tree by all accounts, e.g. EmptyTree instanceof Tree would be true, while null instanceof Tree would always be false.
Consider the following binary tree: (taken from here)
Given that leaf nodes will either be true or false, how can I find the branch (or branches) where all of the leaf nodes are true?
So if only 8, 5, 6 or 7 are true, then the first branch wouldn't match (it would need 9 to be true to match), but the second branch would match as all of its leaves are true.
Even identifying the name for this type of search would help, so I can Google it.
You can use a recursive function, diving deeper into the tree and determining bottom-up whether for a certain branch all leaves are true. Those branches can then be stored in some list.
Here's some Python code. Call this function with the tree's root node as the first parameter and an empty list as the second, and the list will be populated with the correct branches.
def allTrue(node, trueList=[]):
    if isLeaf(node):
        return node.value == True
    else:
        leftTrue = allTrue(node.left, trueList)
        rightTrue = allTrue(node.right, trueList)
        bothTrue = leftTrue and rightTrue
        if bothTrue:
            trueList.append(node)
        return bothTrue
One thing to look out for: many programming languages try to be clever, or lazy, by not evaluating the second argument of x and y if x is already false. In this case, however, that would result in not visiting the right branch whenever the left branch is not all-true, missing some all-true branches. Thus the recursive calls are better placed on separate lines.
You can do a post-order tree traversal and mark each non-leaf node x true if both children have all leaves of their sub-trees true, and false otherwise (it's an AND of the children's labels). This way you recursively mark the whole tree with your desired property.
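A small Python sketch of that post-order labelling, assuming binary nodes with .left, .right and boolean values on the leaves; the attribute name all_true is mine:

def mark_all_true(node):
    """Post-order labelling: node.all_true is True iff every leaf in the
    subtree rooted at node has the value True. A missing child is treated
    as vacuously true."""
    if node.left is None and node.right is None:        # leaf
        node.all_true = (node.value == True)
    else:
        left_ok = mark_all_true(node.left) if node.left else True
        right_ok = mark_all_true(node.right) if node.right else True
        node.all_true = left_ok and right_ok            # AND of the children's labels
    return node.all_true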
Given: list of N nodes. Each node consists of 2 numbers: nodeID and parentID. parentID may be null (if it's a root node).
Is there an algorithm for recreating a tree from this list of nodes with time complexity better than O(N^2)?
Each node may have 0 or more children.
Short description of an algorithm with O(N^2) complexity:
find a root Node, put it to a Queue
while Queue is not empty
    parentNode = Queue.pop()
    loop through nodes
        if currentNode.parentId = parentNode.id
            parentNode.addChild(currentNode)
            queue.push(currentNode)
            nodes.remove(currentNode)
It seems that this algorithm has O(N^2) time complexity (with a small coefficient, maybe 0.25), but I may be wrong in the complexity calculation here.
Since you've already got an external structure to the tree (a queue), I'm going to assume you don't mind using a bit of extra memory to get the job done faster.
Do it in two conceptual steps with a hash table:
First make a hash table that relates node IDs to their actual node.
Then look up a node's parent based on its parent's ID in the hash table and add the child to that parent.
More programmatically:
for each node
    add node to hash table indexed by node's ID
for each node
    if parent ID is null, set node as the root
    otherwise look up the parent in the hash table using the node's parent ID
        add the node as a child of the found parent
The only potential issue with this technique is you don't necessarily end up with a valid tree until the very end. (That is, the root node may not have a child until the last link.) Depending on what you're doing with your tree, this may be an issue.
If that is an issue you can end up doing the same operation with a data structure that doesn't have that issue (just a vanilla tree with no attached data) and then mirror the structure.
All in all, this should be O(N) on the average.
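A small Python sketch of the two conceptual steps above, assuming the input is a list of (node_id, parent_id) pairs and using plain dicts as nodes; the name build_tree is illustrative:

def build_tree(records):
    """Rebuild the tree from (node_id, parent_id) pairs in O(N) expected time,
    following the hash-table approach above."""
    # First pass: hash table mapping node ID -> node object.
    nodes = {node_id: {"id": node_id, "children": []} for node_id, _ in records}
    root = None
    # Second pass: attach each node to the parent found via its parent ID.
    for node_id, parent_id in records:
        if parent_id is None:
            root = nodes[node_id]
        else:
            nodes[parent_id]["children"].append(nodes[node_id])
    return root

# e.g. build_tree([(1, None), (2, 1), (3, 1), (4, 2)])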
For each node, initialize a list of children, and then for each node append it to its parent's children list. Complexity O(n).
For node in NodeList:
    node.childList = []
For node in NodeList:
    if node.parent is not NULL:
        node.parent.childList.append(&node)
If the parent link is not readily available, then create a hash map. FWIW, the best worst-case complexity of hash mapping is O(log n) for each insertion or lookup. So the final complexity becomes O(n log n).
I do not know what your input is, but let's assume that it is some sort of unordered list.
You can then create a tree structure by just putting them into a data structure that allows looking them up by their nodeID. For example, an array would do that. You then have a tree that is only linked in the direction of the parents. Transformation from an unordered list into this array is possible in linear time, assuming that the nodeIDs are unique.
In order to get the tree also linked in the direction of the children, you can prepare the nodes with a data structure (e.g. a list) to hold the children, and then do a second pass that adds each node to its parent's children list. This is also possible in linear time.