In binary trees, are sibling nodes necessarily ordered? - data-structures

Just been learning about binary trees in school, and two rules of binary trees is that
every node has at most 2 child nodes
there exists linear ordering defined for the children of each node (ordered pair)
Now, all types of binary trees (full, complete, etc.) are binary trees so they must satisfy these 2 conditions.
However, I saw on GeeksForGeeks this example:
How is 'linear ordering', ordered pair, defined here?
It seems that for the sibling nodes in this picture, some left ones are larger than the right one, some right nodes are larger than the left one.
If asked to check if a given tree is a binary tree, how do I make sure of the second property, that the children of each node have to be ordered?
Thanks

This is one of the complicated ways to introduce a binary tree.
two rules of binary trees is that
every node has at most 2 child nodes
there exists linear ordering defined for the children of each node (ordered pair)
Simple ways of introducing binary trees I could think of are "at most two children and no cycles" or "at most two children and unique path between any pair of vertices".
But fine. You bring up the point of linear order. Lets discuss that.
Here
A linear ordering on a finite collection of objects may be described
as follows: each object has exactly one immediate predecessor object
and one immediate successor object with two exceptions: A first object
has no predecessor and a last object has no successor.
If you have learnt about traversal so far, with the above definition, I would take binary tree traversals as linear order - preorder, postorder, inorder, level order. This applies to all types of binary trees (full, complete, etc.) which includes the complete binary tree you posted as an image.

Related

How to check if two binary trees share a node

Given an array of binary trees find whether any two trees share a node, not value wise, but "pointer" wise. At the bottom I provided an example.
My approach was to iterate through all the trees and store all the leaves (pointers) from each tree into a list, then check if list has any duplicates, but that's a rather slow approach. Is there perhaps a quicker way to solve this?
In the worst case you will have to traverse all nodes (all pointers) to find a shared node (pointer), as it might happen to be the last one visited. So the best time complexity we can expect to have is O(𝑚+𝑛) where 𝑚 and 𝑛 represent the number of nodes in either tree.
We can achieve this time complexity if we store the pointers from the first tree in a hash set and then traverse the pointers of the second tree to see if any of those is in the set. Assuming that get/set operations on a hash set have an amortized constant time complexity, the overal time complexity will be O(𝑚+𝑛).
If the same program is responsible for constructing the trees, then a reuse of the same node can be detected upon insertion. For instance, reuse of the same node in multiple trees can be completely avoided by having the insert method of your tree only take a value as argument, never a node instance. The method will then encapsulate the actual creation of the node, guaranteeing its uniqueness.
An idea for O(#nodes) time and O(1) space. It does more traversal work than simple traversals using a hash table, but it doesn't have the cost of using a hash table. I don't know what's better. Might depend on the language.
For two trees
Create one extra node. Do a Morris traversal of the first tree. It only modifies right child pointers, so we can use left child pointers for marking nodes as seen. For every tree node without left child, set our extra node as left child. Whenever checking a left child pointer, treat our extra node like a null pointer, i.e., don't visit it. After the traversal, the tree structure is restored, and all originally left-child-less tree nodes now point to our extra node as left child. That includes all leaf nodes.
Do a Morris traversal of the second tree. Again treat pointers to our extra node like null pointers. If we ever do encounter our extra node, we know the trees share a node. If not, then we know the trees don't share a node, since if they did share any, they'd also share a leaf node (just go down from any shared node to a leaf node, that's also shared), and all leafs nodes of the first tree are marked. After the traversal, the second tree is restored.
Do a Morris traversal of the first tree again, this time removing our extra node, restoring the original null pointers.
For an array of more than two trees
Mark the first tree as above. Check the second tree as above. Mark the second tree. Check the third. Mark the third. Check the fourth. Mark the fourth. Etc. When you found a shared node or there are no more trees, unmark the marked trees.
Every shared node must have two parents, or an ancestor with two parents.
LOOP over nodes
IF node has two parents
MARK node as shared
Mark all descendants as shared.

How would you keep an ordinary binary tree (not BST) balanced?

I'm aware of ways to keep binary search trees balanced/self-balancing using rotations.
I am not sure if my case needs to be that complicated. I don't need to maintain any sorted order property like with self-balancing BSTs. I just have an ordinary binary tree that I may need to delete nodes or insert nodes. I need try to maintain balance in the tree. For simplicity, my binary tree is similar to a segment tree, and every time a node is deleted, all the nodes along the path from the root to this node will be affected (in my case, it's just some subtraction of the nodal values). Similarly, every time a node is inserted, all the nodes from the root to the inserted node's final location will be affected (an addition to nodal values this time).
What would be the most straightforward way to keep a tree such as this balanced? It doesn't need to be strictly as height balanced as AVL trees, but something like RB trees or maybe slightly less balanced is acceptable as well.
If a new node does not have to be inserted at a particular spot -- possibly determined by its own value and the values in the tree -- but you are completely free to choose its location, then you could maintain the shape of the tree as a complete tree:
In a complete binary tree every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible.
An array is a very efficient data structure for a complete tree, as you can store the nodes in their order in a breadth-first traversal. Because the tree is given to be complete, the array has no gaps. This structure is commonly used for heaps:
Heaps are usually implemented with an array, as follows:
Each element in the array represents a node of the heap, and
The parent / child relationship is defined implicitly by the elements' indices in the array.
Example of a complete binary max-heap with node keys being integers from 1 to 100 and how it would be stored in an array.
In the array, the first index contains the root element. The next two indices of the array contain the root's children. The next four indices contain the four children of the root's two child nodes, and so on. Therefore, given a node at index i, its children are at indices 2i + 1 and 2i + 2, and its parent is at index floor((i-1)/2). This simple indexing scheme makes it efficient to move "up" or "down" the tree.
Operations
In your case, you would define the insert/delete operations as follows:
Insert: append the node to the end of the array. Then perform the mutation needed to its ancestors (as you described in your question)
Delete: replace the node to be deleted with the node that currently sits at the very end of the array, and shorten the array by 1. Make the updates needed that follow from the change at these two locations -- so two paths from root-to-node are impacted.
When balancing non-BSTs, the big question to ask is
Can your tree efficiently support rotations?
Some types of binary trees, like k-d trees, have a specific layer-by-layer structure that makes rotations infeasible. Others, like range trees, have auxiliary metadata in each node that's expensive to update after a rotation. But if you can handle rotations, then you can use just about any of the balancing strategies out there. The simplest option might be to model your tree on a treap: put a randomly-chosen weight field into each node, and then, during insertions, rotate your newly-added leaf up until its weight is less than its parent. To delete, repeatedly rotate the node with its lighter child until it's a leaf, then delete it.
If you cannot support rotations, you'll need a rebalancing strategy that does not require them. Perhaps the easiest option there is to model your tree after a scapegoat tree, which works by lazily detecting a node that's too deep for the tree to be balanced, then rebuilding the smallest imbalanced subtree possible into a perfectly-balanced tree to get everything back into order. Deletions are handled by rebuilding the whole tree once the number of nodes drops by some constant factor.

Joining of binary trees

Suppose we have a set of binary trees with their inorder and preorder traversals given,and where no tree is a subtree of another tree in the given set. Now another binary tree Q is given.find whether it can be formed by joining the binary trees from the given set.(while joining each tree in the set should be considered atmost once) joining operation means:
Pick the root of any tree in the set and hook it to any vertex of another tree such that the resulting tree is also a binary tree.
Can we do this using LCA (least common ancestor)?or does it needs any special datastructure to solve?
I think a Binary tree structure should be enough. I don't think you NEED any other special data structure.
And I don't understand how you would use LCA for this. As far as my knowledge goes, LCA is used for knowing the lowest common Ancestor for two NODES in the same tree. It would not help in comparing two trees. (which is what I would do to check if Q can be formed)
My solution in words.
The tree Q that has to be checked if it can be made from the set of trees, So I would take a top-down approach. Basically comparing Q with the possible trees formed from the set.
Logic:
if Q.root does not match with any of the roots of the trees in the set (A,B,C....Z...), No solution possible.
if Q.root matches a Tree root (say A) check corresponding children and mark A as used/visited. (Which is what I understand from the question: a tree can be used only once)
We should continue with A in our solution only if all of Q's children match the corresponding children of A. (I would do Depth First traversal, Breadth First would work as well).
We can add append a new tree from the set (i.e. append a new root (tree B) only at leaf nodes of A as we have to maintain binary tree). Keep track of where the B was appended.
Repeat same check with corresponding children comparison as done for A. If no match, remove B and try to add C tree at the place where B was Added.
We continue to do this till we run out of nodes in Q. (unless we want IDENTICAL MATCH, in which case we would try other tree combinations other than the ones that we have, which match Q but have more nodes).
Apologies for the lengthy verbose answer. (I feel my pseudo code would be difficult to write and be riddled with comments to explain).
Hope this helps.
An alternate solution: Will be much less efficient (try only if there are relatively less number of trees) : forming all possible set of trees ( first in 2s then 3s ....N) and and Checking the formed trees if they are identical to Q.
the comparing part can be referred here:
http://www.geeksforgeeks.org/write-c-code-to-determine-if-two-trees-are-identical/

Difference between ordered and unordered (rooted) trees

I am reading algorithms by Robert Sedwick. Some definitions from the book are shown below.
A tree (also an ordered tree) is a node (called the root) connected to
a sequence of disjoint trees. Such a sequence is called a forest.
A rooted tree (or unordered tree) is a node (called the root)
connected to a multiset of rooted trees. (such a multiset is called an
unordered forest.
My questions on the above text are
I am having difficutlty in understanding above defintions. Can any one please explain with examples.
What does author mean by disjoint trees?
What does author mean by multiset rooted trees?
Thanks for your time and help
A tree by this definition is more or less what we normally understand by tree: a node connected to an ordered sequence of (sub)trees. It's a recursive definition: if the sequence is empty, the node is called leaf, and if not, each of the tree in the sequence is also "a node connected to an ordered sequence of (sub)trees."
By disjoint the author means that the subtrees don't have nodes in common.
The definition means that the subtrees of a rooted tree are not in a particular order and that they may be repeated. A multiset is kind of like a set that allows multiples.
An ordered tree (a "tree" of the first definition) has the subtrees in a particular order, and the subtree sequence cannot contain the same tree twice, because the subtrees must be disjoint. A rooted tree does not have these restrictions; by this definition a root might have a subtree twice, in a structure that resembles a cycle.
I don't have Sedwick's book to check if or why this definition makes sense; a more common definition or rooted tree would use a normal set for subtrees, rather than a multiset. Perhaps the intention is to allow multiple links between a node and its children while disallowing other kinds of cycles, such as links between siblings and cousins.

Complete binary tree definitions

I have some questions on binary trees:
Wikipedia states that a binary tree is complete when "A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible." What does the last "as far left as possible" passage mean?
A well-formed binary tree is said to be "height-balanced" if (1) it is empty, or (2) its left and right children are height-balanced and the height of the left tree is within 1 of the height of the right tree, taken from How to determine if binary tree is balanced?, is this correct or there's "jitter" on the 1-value? I read on the answer I linked that there could be also a difference factor of 4 between the height of the right and the left tree
Do the complete and height-balanced definitions just apply to binary tree or just any other tree?
Following the reference of the definition in wikipedia, I got to
this page. The definition was taken from there but modified:
Definition: A binary tree in which every level, except possibly the deepest, is completely filled. At depth n, the height of the
tree, all nodes must be as far left as possible.
It continues with a note below though,
A complete binary tree has 2k nodes at every depth k < n and between 2n and 2^(n+1) - 1 nodes altogether.
Sometimes, definitions vary according to convenience (be useful for something). That passage might be a variation which, as I understand, requires leaf nodes to fill first the left side of the deepest level (that is, fill from left to right). The definition that I usually found is exactly as described above but without that
passage.
Usually the definition taken for height-balanced tree is the one you
described. In other words:
A tree is balanced if and only if for every node the heights of its two subtrees differ by at most 1.
That definition was taken from here. Again, sometimes definitions are made more flexible to serve specific purposes. For example, the definition of an AVL tree says that
In an AVL tree, the heights of the two child subtrees of any node
differ by at most one
Still, I remember once I had to rewrite an algorithm so that the tree
would be considered height-balanced if the two child subtrees of any
node differed by at most 2. Note that the definition you gave is recursive, this is very common for binary trees.
In a tree whose number of children is variable, you wouldn't be able to say that it is complete (any parent could have the number of children that you want). Still, it can apply to n-ary trees (with a fixed amount of n children).
Do the complete and height-balanced definitions just apply to binary
tree or just any other tree?
Short answer: Yes, it can be extended to any n-ary tree.

Resources