Learning about properties of Trees - data-structures

I'm trying to learn about trees and their uses. One of my class slides gives this definition of the height of a tree:
The height (maximum distance) of a tree is the maximum level among all of the nodes in the tree.
Not quite sure what that means. Can anyone explain? Thanks

The height of a tree is the length of the longest downward path to a leaf from the root.
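A minimal sketch of that definition in Python, assuming the root sits at level 0 and the height is counted in edges (the `Node` class and `height` function are illustrative names, not something from the slide):

```python
class Node:
    """A tree node with any number of children."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def height(node):
    """Length, in edges, of the longest downward path from node to a leaf."""
    if not node.children:
        return 0  # a leaf is its own longest path
    return 1 + max(height(child) for child in node.children)

# root -> a -> b is the longest path (2 edges); root -> c is shorter.
tree = Node("root", [Node("a", [Node("b")]), Node("c")])
print(height(tree))
```

With this convention a single-node tree has height 0. Some texts count nodes instead of edges, which makes every height one larger.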

Related

Is RBT always full?

As I understand, binary trees do not have to be full. However, it seems that RBTs have to be full (sometimes children are NIL). Is that true, or am I missing something?
Every path from the root to a leaf in a given red-black tree contains the same number of black nodes. In that sense I suppose you could say that red-black trees are always 'full', but I don't see that being a very useful definition.
The general idea of the red-black algorithm is to bound the difference in total depth (not just black height) between the shallowest leaf and the deepest leaf: the longest root-to-leaf path is at most twice as long as the shortest. If you use total depth as your basis, then an RB tree is 'full' only if all leaf nodes are at the same depth (just as a regular binary tree is full in that sense if all leaves are at the same depth), and an RB tree does not have to be full.
No. Red-black trees are not always full; in fact, that is rare. You can learn more about it in Introduction to Algorithms, 3rd edition (Cormen, page 308). Page 310 has some figures illustrating the answer, which I am not showing here for copyright reasons.
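A small sketch of a counterexample, assuming "full" means that every node has either zero or two non-NIL children. The `Node` class and the helper names are illustrative, and the check covers only the colour rules, not the search-tree ordering:

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left    # None stands for a NIL leaf
        self.right = right

def black_height(node):
    """Black height of a valid subtree; raises ValueError if a rule is broken."""
    if node is None:
        return 1  # NIL leaves count as black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red node with a red child")
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("paths with different numbers of black nodes")
    return left + (1 if node.color == BLACK else 0)

def is_red_black(root):
    """Check root colour, red-red rule and equal black heights."""
    if root is not None and root.color != BLACK:
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True

def is_full(node):
    """True if every node has zero or two non-NIL children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False
    return is_full(node.left) and is_full(node.right)

# Black root with a single red child: valid colouring, but not full.
tree = Node(2, BLACK, left=Node(1, RED))
print(is_red_black(tree), is_full(tree))
```

The root here has one real child and one NIL child, so the tree satisfies the red-black rules without being full under that definition.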

Height difference between leaves in an AVL tree

What is the maximum difference in level between any two leaves in an AVL tree? If I take an example, my tree becomes unbalanced if the height difference is more than 2 (for any two leaves), but the answer is that the difference can be any value. I really don't understand how this is possible. Can anyone explain with examples?
The difference in levels of any two leaves can be any value! The definition of an AVL tree constrains only the height difference between the two subtrees of each node.
So you fill the subtrees to equal height and then add new nodes just to create that single-level difference. But nothing says that a subtree can't itself contain subtrees built in exactly the same way. Of course the tree is self-balancing, but if we are careful not to disturb its balance, we can create any height difference between some leaves.
Example with leaf 24 on level 3 and leaf 10 on level 6 (figure not included).
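The same effect can be reproduced in code with a different tree from the one in that example. This is a sketch that builds the sparsest possible AVL shape of a given height (a left subtree of height h-1 and a right subtree of height h-2 at every node) and then lists the leaf depths. All names are illustrative, keys are omitted because only the shape matters, and depths are counted from 0 at the root:

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def sparsest_avl(h):
    """AVL-shaped tree of height h (in edges) with as few nodes as possible."""
    if h < 0:
        return None
    if h == 0:
        return Node()
    return Node(sparsest_avl(h - 1), sparsest_avl(h - 2))

def height(node):
    """Height in edges; an empty subtree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def is_avl(node):
    """At every node, the two subtree heights differ by at most one."""
    if node is None:
        return True
    if abs(height(node.left) - height(node.right)) > 1:
        return False
    return is_avl(node.left) and is_avl(node.right)

def leaf_depths(node, depth=0):
    """Depths of all leaves below node."""
    if node is None:
        return []
    if node.left is None and node.right is None:
        return [depth]
    return leaf_depths(node.left, depth + 1) + leaf_depths(node.right, depth + 1)

tree = sparsest_avl(6)
depths = leaf_depths(tree)
print(is_avl(tree), min(depths), max(depths))  # True 3 6
```

Every node satisfies the AVL balance condition, yet the shallowest and deepest leaves are three levels apart. Raising the height widens the gap further, by roughly half the height.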
According to the explanation in this Wikipedia article, the balancing operations in an AVL tree rearrange the tree so that, at every node, the heights of the two child subtrees differ by no more than one. This is the key property of the data structure, and it makes the retrieval of nodes efficient (namely logarithmic in the number of nodes of the tree, since in the worst case a path from the root to a leaf is traversed).

Running time for binary search tree

The textbook says the number of split operations is bounded by the height of the tree, which is O(log n).
I don't quite understand why it is bounded by the height of the tree. Can someone explain that?
When you start at the root, and go as far as you can down some path towards the bottom, the maximum number of nodes you can come across is equal to the height of the tree (this should be easy to see and it is, pretty much by definition, the height of the tree).
Now when you're searching in a binary search tree, you start at the root, and, at each step, you look at the current node, and stop, go left or go right (going left or going right can be considered a split operation). This process involves the same number of nodes as the one described above (going from the root down some path), which involves encountering a number of nodes, and thus split operations, no more than the height of the tree.
Also note that the height of the tree is only O(log n) if the tree is balanced (see this page for more).
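A rough sketch of this argument in Python, counting the nodes examined during a search and comparing that count with the height. Height is measured in nodes here to match the wording above, and the class and function names are illustrative rather than taken from the textbook:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Plain, unbalanced BST insert; returns the subtree root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root

def search(root, key):
    """Return (found, visited), where visited counts the nodes examined."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            return True, visited
        node = node.left if key < node.key else node.right
    return False, visited

def height_in_nodes(root):
    """Number of nodes on the longest root-to-leaf path."""
    if root is None:
        return 0
    return 1 + max(height_in_nodes(root.left), height_in_nodes(root.right))

balanced = build([8, 4, 12, 2, 6, 10, 14])  # height 3 for 7 keys
chain = build(range(1, 8))                  # sorted input: height 7 for 7 keys
print(search(balanced, 6), height_in_nodes(balanced))
print(search(chain, 7), height_in_nodes(chain))
```

In both trees the search never examines more nodes than the height, but only the balanced one keeps that height logarithmic in the number of keys.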
Most probably, the data structure in question in the textbook you are referring to is a balanced binary tree with n nodes. Since it is balanced, its height is O(log n). Detailed definitions and brief explanations concerning the height can be found here.

What is the name (if any) for this kind of tree?

I have this tree in which each node has exactly 10 childnodes (0-9). Each node has some associated data (say, for example, a name and a tag and a color) which, I guess, isn't important for this question. Each of the childnodes in turn has exactly 10 childnodes. A childnode can be null (which 'ends' the branch) or contain another node.
To visualize what I'm talking about I made this diagram (fear my paintz0r skillz!):
A black box is a null-node. A white box is a node which contains data and childnodes. As you can see, each node, even the root, has exactly 10 childnodes. For simplicity and to keep the diagram sane, I have drawn some nodes very tiny, but you can imagine these tiny nodes being the same.
This structure allows me to traverse a path consisting of digits very quickly: a path of 47352 would lead me down the "orange path" to the final destination; 4->7->3->5 where the final 2 cannot be resolved because that last one is a null-node (although colored red) and contains no childnodes.
My question is pretty simple actually: what is this kind of tree called? I have gone through all the trees in Wikipedia's Tree (data structure) article, and the closest I (think I) could get is the Octree and/or K-ary tree. Along those lines of reasoning my tree would be called a Dectree, Decitree, 10-ary tree or 10-way tree or something. But there might be a better name for this. So: anyone?
K-ary tree with K=10
In graph theory, a k-ary tree is a rooted tree in which each node has no more than k children. It is also sometimes known as a k-way tree, an N-ary tree, or an M-ary tree. A binary tree is the special case where k=2.
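The structure described in the question could be sketched as a 10-ary tree like this. `DigitNode`, `insert` and `follow` are hypothetical names chosen for illustration, and a missing child is represented by `None`:

```python
class DigitNode:
    """Node of a 10-ary tree: one child slot per digit 0-9."""
    def __init__(self, data=None):
        self.data = data
        self.children = [None] * 10  # None marks a null-node

def insert(root, digits, data):
    """Create nodes along the digit path and store data at its end."""
    node = root
    for ch in digits:
        d = int(ch)
        if node.children[d] is None:
            node.children[d] = DigitNode()
        node = node.children[d]
    node.data = data
    return node

def follow(root, digits):
    """Walk the digit path; return (last node reached, digits consumed)."""
    node = root
    consumed = 0
    for ch in digits:
        child = node.children[int(ch)]
        if child is None:
            break  # null-node: the rest of the path cannot be resolved
        node = child
        consumed += 1
    return node, consumed

root = DigitNode("root")
insert(root, "4735", "destination")
node, consumed = follow(root, "47352")
print(node.data, consumed)  # destination 4
```

Following "47352" stops after 4->7->3->5 because the child for the final 2 is a null-node, which matches the walk described in the question.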
This is something like a B-tree.

Binary tree visit: get from one leaf to another leaf

Problem: I have a binary tree, all leaves are numbered (from left to right, starting from 0) and no connection exists between them.
I want an algorithm that, given two indices (of 2 distinct leaves), visits the tree starting from the greater leaf (the one with the higher index) and gets to the lower one.
The internal nodes of the tree do not contain any useful information.
I should choose the path based only on the leaves' indices. The path starts from a leaf and terminates at a leaf, and of course I can access a leaf if I know its index (through an array of pointers).
The tree is static, no insertion or deletion of nodes is allowed.
I have developed an algorithm to do it but it really sucks... any ideas?
One option would be to find the least common ancestor of the two nodes, along with the sequence of nodes you should take from each node to get to that ancestor. Here's a sketch of the algorithm:
Starting from each node, walk back up to that node's parent until you reach the root. Count the number of nodes on the path from each node to the root. Let the depth of the first node be h1 and the depth of the second node be h2.
Let h = min(h1, h2). This is the depth of the shallower of the two nodes.
Starting from the deeper node, keep following the parent pointer until both nodes are at depth h. Record the nodes you followed during this step. At this point, both nodes are at the same depth.
Until you find a common node, keep marching upwards from each node to its parent. Eventually you will hit their common ancestor. At this point, follow the path from the first node up to this ancestor, then down the path from the ancestor down to the second node.
In the worst case, this takes O(h) time and O(h) space, where h is the height of the tree. For a balanced binary tree this is O(lg n) time and space, which is quite good.
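The steps above can be sketched roughly as follows. The sketch assumes that every node stores a parent pointer (the question does not say so explicitly), that both nodes belong to the same tree, and that `leaves` is the array of leaf pointers mentioned in the question. All names are illustrative:

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None for the root

def depth(node):
    """Number of parent links between node and the root."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d

def path_between(start, end):
    """Nodes on the path start -> common ancestor -> end, inclusive."""
    up, down = [], []  # nodes climbed from start / from end
    d_start, d_end = depth(start), depth(end)
    # Lift the deeper node until both are at the same depth.
    while d_start > d_end:
        up.append(start)
        start, d_start = start.parent, d_start - 1
    while d_end > d_start:
        down.append(end)
        end, d_end = end.parent, d_end - 1
    # Climb in lockstep until the two walks meet at the common ancestor.
    while start is not end:
        up.append(start)
        down.append(end)
        start, end = start.parent, end.parent
    return up + [start] + down[::-1]

root = Node("root")
a, b = Node("a", root), Node("b", root)
leaves = [Node("leaf0", a), Node("leaf1", a), Node("leaf2", b), Node("leaf3", b)]
print([n.name for n in path_between(leaves[3], leaves[0])])
# ['leaf3', 'b', 'root', 'a', 'leaf0']
```

The two lists hold at most one node per level, which is where the O(h) space bound comes from.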
If you're interested in a Much More Hardcore version of this algorithm, consider looking into Tarjan's Least Common Ancestors algorithm, which, with linear preprocessing time, can be used to find the least common ancestor much more rapidly than this.
Hope this helps!
Distance between any two nodes can be calculated with the help of lowest common ancestor:
Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca)
where lca is the lowest common ancestor of n1 and n2.
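A short sketch of this formula, assuming parent pointers and that both nodes are in the same tree. Here Dist(root, x) is simply the depth of x, and `lca` is a plain ancestor-set lookup chosen for brevity, not an optimised method. All names are illustrative:

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

def depth(node):
    """Dist(root, node): number of edges from the root down to node."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d

def lca(n1, n2):
    """Lowest common ancestor, found via the set of n1's ancestors."""
    ancestors = set()
    while n1 is not None:
        ancestors.add(n1)
        n1 = n1.parent
    while n2 not in ancestors:
        n2 = n2.parent
    return n2

def dist(n1, n2):
    """Dist(root, n1) + Dist(root, n2) - 2 * Dist(root, lca)."""
    return depth(n1) + depth(n2) - 2 * depth(lca(n1, n2))

root = Node("root")
a, b = Node("a", root), Node("b", root)
leaf0, leaf1, leaf2 = Node("leaf0", a), Node("leaf1", a), Node("leaf2", b)
print(dist(leaf2, leaf0), dist(leaf0, leaf1))  # 4 2
```

The subtraction removes the shared stretch from the root down to the common ancestor, which both depths count once each.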
See this for more help with this algorithm, and see this video to learn how to calculate the LCA.

Resources