Is the root node an internal node? - data-structures

I've looked around the web and at a couple of questions here on Stack Overflow. Here are the definitions I found:
Generally, an internal node is any node that is not a leaf (a node with no children)
Non-leaf/Non-terminal/Internal node – has at least one child or descendant node with degree not equal to 0
As far as I understand it, an internal node is a node that is not a leaf.
I was about to conclude that the root is also an internal node, but there seems to be some ambiguity in the definition, as seen here:
What is an "internal node" in a binary search tree?
As the wonderful picture shows, internal nodes are nodes located between the root of the tree and the leaves
If we follow that definition then the root node isn't going to be counted as an internal node. So is a root node an internal node or not?

A statement from the book Discrete Mathematics and Its Applications, 7th edition, by Rosen:
Vertices that have children are called internal vertices. The root is an internal vertex unless it is the only vertex in the graph, in which case it is a leaf.
Supportive Theorem:
For any positive integer n, if T is a full binary tree with n internal vertices, then T has n + 1 leaves and a total of 2n + 1 vertices.
case 1:
    O        <- 1 internal node as well as root
   / \
  O   O      <- 2 leaf nodes
case 2: Trivial tree
    O        <- 0 internal vertices; this vertex is a leaf
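The two cases above can be checked with a short sketch. It assumes Rosen's convention (a vertex with children is internal, a childless vertex is a leaf, even when it is the root); the `Node` class and `count_internal_and_leaves` function are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """A binary tree vertex; both children are None for a leaf."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def count_internal_and_leaves(node: Node) -> Tuple[int, int]:
    """Return (internal vertices, leaves) for the tree rooted at node."""
    if node.left is None and node.right is None:
        return 0, 1  # a childless vertex is a leaf, even if it is the root
    internal, leaves = 1, 0  # this vertex has a child, so it is internal
    for child in (node.left, node.right):
        if child is not None:
            sub_internal, sub_leaves = count_internal_and_leaves(child)
            internal += sub_internal
            leaves += sub_leaves
    return internal, leaves


case1 = Node(Node(), Node())                 # root with two leaf children
case2 = Node()                               # trivial tree: only the root
bigger = Node(Node(Node(), Node()), Node())  # a full binary tree with n = 2

for tree in (case1, case2, bigger):
    n, leaves = count_internal_and_leaves(tree)
    print(n, leaves, n + leaves)
# prints:
# 1 2 3
# 0 1 1
# 2 3 5
```

In each full binary tree the counts agree with the theorem: n internal vertices, n + 1 leaves, 2n + 1 vertices in total.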

IMHO, when you are talking about a tree with more than one node, we can say the root node is an internal node. When there is only one node (the root node), the question of whether it is an internal node doesn't arise. Hence we can vacuously say it is an internal node.

Yes, the root node is an internal node.
[More explanation]
A root node is never called a leaf node, even if it is the only node present in the tree.
For example, if a tree has only one node, we say that it is a tree with only a root node; we never say that the tree has a single leaf node.
Since an internal node means a non-leaf node, and because the root node is never considered a leaf node, I would say that in a single-node tree the root node is an internal node.

"A node with no children is a leaf or external node. A non-leaf node is an internal node."
Source: "Introduction to Algorithms", 3rd edition, page 1176, last line.
So the root is also an internal node, except when it is the only node of the tree.

Related

how to merge tree nodes into one node

Hopefully, my question is not a duplicate.
I would like to know whether there is an algorithm that merges some nodes of a tree into a new tree, so that each node in the new tree consists of some nodes of the old tree.
In order to explain my idea, I drew a graph to explain the question.
Input: An original tree.
Output: A new tree. The new tree must satisfy the following conditions:
The number of nodes in the new tree should be a fixed number k.
Each node in the new tree must consist of nodes in the original tree. For example, node A in the second graph contains nodes 1, 3, and 4 of the first graph. Node D in the second graph contains nodes 9, 12, and 13 of the first graph.
If one node of the original tree is contained in a node of the new tree, it cannot appear in another node of the new tree.
The nodes in the new tree do not necessarily have to be subtrees of the original tree. For example, node C in the second graph contains nodes 6, 7, and 10 of the first graph, which is not a subtree of the original graph. Because both node 6 and node 7 in the original graph connect to nodes in the dotted area of A in the original tree, they can be grouped into node C of the second graph.
Currently, I just want the original tree to be converted into a new tree that has k nodes and meets the above conditions. For a given tree, there are many solutions. For example, graph 3 and graph 4 illustrate another solution for the original tree. It also has 4 nodes.
You need an additional condition or desired property for your output; otherwise it is quite trivial:
Starting from leaf nodes, copy K - 1 nodes to the output: B = {11}, C = {12}, D = {13}
Group all other nodes into a single K-th node: A = {1,2,3,4,5,6,7,8,9,10}
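A minimal sketch of that trivial construction follows. The original figure is not available here, so the edge list below is a made-up 13-node tree chosen only so that 11, 12, and 13 are leaves; `trivial_merge` and `children` are hypothetical names.

```python
from typing import Dict, List


def trivial_merge(children: Dict[int, List[int]], k: int) -> List[List[int]]:
    """Split the nodes into k groups: k - 1 single leaves plus one group with the rest."""
    nodes = set(children)
    for kids in children.values():
        nodes.update(kids)
    leaves = sorted(n for n in nodes if not children.get(n))
    if k < 1 or k > len(nodes) or k - 1 > len(leaves):
        raise ValueError("cannot form k groups with this construction")
    # Peel off k - 1 leaves as singleton groups.
    singles = leaves[-(k - 1):] if k > 1 else []
    # Everything else collapses into one group, which becomes the new root.
    rest = sorted(nodes - set(singles))
    return [rest] + [[leaf] for leaf in singles]


# Hypothetical tree (parent -> children); not the tree from the question's figure.
children = {1: [2, 3, 4], 2: [5], 3: [6], 4: [7, 8],
            5: [11], 6: [9], 7: [10], 9: [12, 13]}

print(trivial_merge(children, 4))
# [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11], [12], [13]]
```

Removing leaves never disconnects a tree, so the large group stays connected and every singleton leaf hangs off it, which is why the result is still a tree with k nodes.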

Recommend a proper data structure

I have an algorithm problem that needs a structure similar to a binary tree. The difference is that it may have nodes that exist independently, apart from the main tree.
Each node has one of three types. The first type marks the starting node, and only one such node exists. The second type marks a connecting node, and the last type marks a leaf node. Each edge has a cost to traverse to its bottom node.
Which data structure is good for computing the cost to reach each node?
UPDATE
OK, I asked this under the data-structure tag because I wanted to avoid explaining what the problem is. But I have to explain it after all, because my explanation was lacking and my English is poor.
I have lists of nodes and edges with costs. There is a starting node (the root node), nodes located in the middle of the tree, and leaf nodes, which are the destinations my program reaches by traversing from the root node. Some of the leaf nodes may be ignored depending on their value, but that is not important. I have to calculate the cost of reaching every leaf node from the root node and take the maximum of those costs. The problem is then to adjust the edge costs so that every other leaf node has the same total cost as that maximum, while keeping the sum of the adjustments to a minimum.
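As a rough illustration of the first half of that description (the cost of reaching each leaf, and the maximum), here is a sketch that stores the tree as an adjacency map of weighted edges. The node names, weights, and the `leaf_costs` helper are all assumptions made up for the example, and the sketch deliberately stops before the minimization step.

```python
from typing import Dict, List, Tuple

# parent -> list of (child, edge cost); a node with no entry is a leaf
Edges = Dict[str, List[Tuple[str, int]]]


def leaf_costs(edges: Edges, root: str) -> Dict[str, int]:
    """Return the total edge cost from root to every leaf."""
    costs: Dict[str, int] = {}
    stack = [(root, 0)]
    while stack:
        node, cost = stack.pop()
        kids = edges.get(node, [])
        if not kids:
            costs[node] = cost  # reached a leaf: record its path cost
        for child, weight in kids:
            stack.append((child, cost + weight))
    return costs


# Hypothetical input data.
edges: Edges = {
    "root": [("a", 3), ("b", 1)],
    "a": [("x", 2), ("y", 4)],
    "b": [("z", 1)],
}

costs = leaf_costs(edges, "root")
target = max(costs.values())
# How much each leaf's path still has to gain to match the maximum.
slack = {leaf: target - cost for leaf, cost in costs.items()}
print(costs, target, slack)
```

An adjacency map keeps the weighted edges next to their parent node, so one traversal from the root yields every leaf cost. The `slack` values only say how far each path is from the maximum; deciding which edges to raise so that the sum of adjustments is minimal is a separate step not covered here.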

What is node in a tree?

According to wiki, everything in a tree is a node.
Terminologies used in Trees
Root – The top node in a tree.
Parent – The converse notion of child.
Siblings – Nodes with the same parent.
Descendant – a node reachable by repeated proceeding from parent to child.
Ancestor – a node reachable by repeated proceeding from child to parent.
Leaf – a node with no children.
Internal node – a node with at least one child.
External node – a node with no children.
Degree – number of sub trees of a node.
Edge – connection between one node to another.
Path – a sequence of nodes and edges connecting a node with a descendant.
Level – The level of a node is defined by 1 + the number of connections between the node and the root.
Height of tree – The height of a tree is the number of edges on the longest downward path between the root and a leaf.
Height of node – The height of a node is the number of edges on the longest downward path between that node and a leaf.
Depth – The depth of a node is the number of edges from the node to the tree's root node.
Forest – A forest is a set of n ≥ 0 disjoint trees.
But then I find the following picture from SAP http://www.sapdesignguild.org/community/design/print_hierarchies2.asp
So my question: is it right to call the root, leaves, parents, children, and siblings in a tree nodes?
Yes. The root is "the root node." A parent is a "parent node." A leaf is a "leaf node." The tree is made up of nodes. The terms root, parent, child, sibling, leaf, etc. just describe the relationships among nodes.
For example, the root node has no parent. Leaf nodes have no children. Sibling nodes share the same parent.
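To make the point concrete, here is a small sketch in which every element of the tree is just a node and the terms from the glossary (depth, level, height, leaf, internal node) are computed from parent/child relationships. The six-node tree and the function names are made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical tree: A is the root; B and C are its children;
# D and E are children of B; F is the child of E.
parent: Dict[str, Optional[str]] = {
    "A": None, "B": "A", "C": "A", "D": "B", "E": "B", "F": "E",
}

# Derive the child lists from the parent links.
children: Dict[str, List[str]] = {}
for node, par in parent.items():
    if par is not None:
        children.setdefault(par, []).append(node)


def depth(node: str) -> int:
    """Number of edges from node up to the root."""
    edges = 0
    while parent[node] is not None:
        node = parent[node]
        edges += 1
    return edges


def level(node: str) -> int:
    """1 + the number of connections between node and the root."""
    return 1 + depth(node)


def height(node: str) -> int:
    """Number of edges on the longest downward path from node to a leaf."""
    kids = children.get(node, [])
    if not kids:
        return 0
    return 1 + max(height(kid) for kid in kids)


def is_internal(node: str) -> bool:
    """A node with at least one child."""
    return bool(children.get(node))


print(depth("F"), level("F"), height("A"), height("B"))  # 3 4 3 2
print([n for n in parent if not is_internal(n)])         # ['C', 'D', 'F']
```

Root, leaf, parent, and sibling never appear as separate types here; they are only properties that a node has relative to other nodes.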

Number of subtrees of root node in a B tree

The definitions of a B tree I have read in various books all contain the following:
Every node except the root node has to be at least half full
If the root node is an index node, it must have at least two children.
I presume that the second special case is there to allow a B tree to have, say, only one key and still be valid. However, if the B tree has many nodes, is it still allowed for the root node to have only two subtrees? Won't this break the guarantees of a B tree, such as easy splitting and joining operations?
However, if the B tree has many nodes, is it still allowed for the root node to have only two subtrees?
Yes, the root is special-cased because every other internal node has siblings that it can merge with.
Suppose that we delete a key and that, as a result, some internal node has too few children. We have two options in the usual B-tree algorithms: have this node take some children from its siblings or just merge siblings outright (possibly propagating the deficiency toward the root). Neither is an option for the root, so we just exempt it from the minimum children requirement. This increases the max height for a given number of keys by at most one, so the asymptotic running time of operations is unaffected.
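The exemption can be written down as an occupancy check. This sketch assumes the common convention in which a B-tree of order m allows at most m children per node and "half full" means at least ceil(m / 2) children; the function names are hypothetical, and other textbooks parameterize the tree differently.

```python
import math


def min_children(order: int, is_root: bool) -> int:
    """Minimum child count for an index (internal) node.

    Assumes `order` is the maximum number of children per node.
    """
    if is_root:
        return 2  # the root is exempt from the half-full rule
    return math.ceil(order / 2)


def occupancy_ok(child_count: int, order: int, is_root: bool) -> bool:
    """Check that an index node's child count is within the allowed range."""
    return min_children(order, is_root) <= child_count <= order


print(occupancy_ok(2, 100, is_root=True))    # True: fine for the root
print(occupancy_ok(2, 100, is_root=False))   # False: too few for any other node
print(occupancy_ok(50, 100, is_root=False))  # True: exactly half full
```

Only one node per tree can use the relaxed bound, which matches the observation above that the exemption costs at most one extra level.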

Binary tree visit: get from one leaf to another leaf

Problem: I have a binary tree, all leaves are numbered (from left to right, starting from 0) and no connection exists between them.
I want an algorithm that, given two indices (of 2 distinct leaves), visits the tree starting from the greater leaf (the one with the higher index) and gets to the lower one.
The internal nodes of the tree do not contain any useful information.
I should choose the path based only on the leaves' indices. The path starts from a leaf and terminates on a leaf, and of course I can access a leaf if I know its index (through an array of pointers).
The tree is static, no insertion or deletion of nodes is allowed.
I have developed an algorithm to do it but it really sucks... any ideas?
One option would be to find the least common ancestor of the two nodes, along with the sequence of nodes you should take from each node to get to that ancestor. Here's a sketch of the algorithm:
Starting from each node, walk back up to that node's parent until you reach the root. Count the number of edges on the path from each node to the root. Let the depth of the first node be h1 and the depth of the second node be h2.
Let h = min(h1, h2). This is the depth of the shallower of the two nodes.
Starting from each node, keep following the node's parent pointer until both nodes are at depth h. Record the nodes you followed during this step. At this point, both nodes are at the same depth.
Until you find a common node, keep marching upwards from each node to its parent, recording the nodes you pass. Eventually you will hit their common ancestor. At this point, follow the path from the first node up to this ancestor, then down the path from the ancestor to the second node.
In the worst case, this takes O(H) time and O(H) space, where H is the height of the tree. For a balanced binary tree this is O(lg n) time and space, which is quite good.
If you're interested in a Much More Hardcore version of this algorithm, consider looking into Tarjan's Least Common Ancestors algorithm, which with linear preprocessing time, can be used to find the least common ancestor much more rapidly than this.
Hope this helps!
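The sketch of the algorithm above could look roughly like this in code. It assumes each node can reach its parent (stored here in a `parent` dictionary) and that leaves are looked up by index through a `leaves` list, as the question describes; the node names and the tree are made up, and each node's distance from the root is called its depth.

```python
from typing import Dict, List, Optional


def depth(parent: Dict[str, Optional[str]], node: str) -> int:
    """Number of edges from node up to the root."""
    edges = 0
    while parent[node] is not None:
        node = parent[node]
        edges += 1
    return edges


def leaf_to_leaf_path(parent: Dict[str, Optional[str]],
                      start: str, end: str) -> List[str]:
    """Nodes visited going from start up to the common ancestor, then down to end."""
    up, down = [start], [end]
    a, b = start, end
    da, db = depth(parent, a), depth(parent, b)
    # Lift the deeper node until both are at the same depth.
    while da > db:
        a = parent[a]
        up.append(a)
        da -= 1
    while db > da:
        b = parent[b]
        down.append(b)
        db -= 1
    # March both upwards together until they meet at the common ancestor.
    while a != b:
        a = parent[a]
        up.append(a)
        b = parent[b]
        down.append(b)
    # Both lists end at the ancestor; reverse `down` without repeating it.
    return up + down[-2::-1]


# Hypothetical tree: r has children p and L2; p has children L0 and L1.
parent = {"r": None, "p": "r", "L0": "p", "L1": "p", "L2": "r"}
leaves = ["L0", "L1", "L2"]  # leaves numbered left to right from 0

print(leaf_to_leaf_path(parent, leaves[2], leaves[0]))  # ['L2', 'r', 'p', 'L0']
print(leaf_to_leaf_path(parent, leaves[1], leaves[0]))  # ['L1', 'p', 'L0']
```

The two recorded lists are what make the downward half possible: parent pointers only lead up, so the way down to the second leaf is the reverse of its own climb.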
Distance between any two nodes can be calculated with the help of lowest common ancestor:
Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca)
where lca is lowest common ancestor.
see this for more help about this algorithm and see this video for learning how to calculate lca.
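A minimal sketch of the distance formula, assuming parent links are available and using a simple ancestor-set lookup for the lowest common ancestor (not a preprocessing-based method). The tree and the helper names are hypothetical.

```python
from typing import Dict, List, Optional


def path_to_root(parent: Dict[str, Optional[str]], node: str) -> List[str]:
    """Nodes from node up to the root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def lca(parent: Dict[str, Optional[str]], n1: str, n2: str) -> str:
    """Lowest common ancestor: the first ancestor of n2 that is also above n1."""
    seen = set(path_to_root(parent, n1))
    for node in path_to_root(parent, n2):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def dist(parent: Dict[str, Optional[str]], n1: str, n2: str) -> int:
    """Dist(root, n1) + Dist(root, n2) - 2 * Dist(root, lca)."""
    def from_root(node: str) -> int:
        return len(path_to_root(parent, node)) - 1  # edges, not nodes

    return from_root(n1) + from_root(n2) - 2 * from_root(lca(parent, n1, n2))


# Hypothetical tree: r has children p and L2; p has children L0 and L1.
parent = {"r": None, "p": "r", "L0": "p", "L1": "p", "L2": "r"}

print(lca(parent, "L0", "L1"), dist(parent, "L0", "L1"))  # p 2
print(lca(parent, "L2", "L0"), dist(parent, "L2", "L0"))  # r 3
```

The subtraction removes the shared stretch between the root and the common ancestor, which both root-to-node distances count once each.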

Resources