Running time for binary search tree - algorithm

The textbook says the number of split operations is bounded by the height of the tree, which is O(logn).
I dont quite understand why it is bounded by the height of the tree? Can someone explain that?

When you start at the root, and go as far as you can down some path towards the bottom, the maximum number of nodes you can come across is equal to the height of the tree (this should be easy to see and it is, pretty much by definition, the height of the tree).
Now when you're searching in a binary search tree, you start at the root, and, at each step, you look at the current node, and stop, go left or go right (going left or going right can be considered a split operation). This process involves the same number of nodes as the one described above (going from the root down some path), which involves encountering a number of nodes, and thus split operations, no more than the height of the tree.
Also note that the height of the tree is only O(log n) if the tree is balanced (see this page for more).

Most probably, in the textbook you are referring to, the data structure in question in a balanced binary tree with n nodes. Since it is balanced, its height is log(n). Detailed definitions and brief explanations converning the height ca be found here.

Related

Definition of Full Binary Tree

I came upon two resources and they appear to say the basic definition in two ways.
Source 1 (and one of my professor) says:
All leaves are at the same level and all non-leaf nodes have two child nodes.
Source 2 (and 95% of internet) says:
A full binary tree (sometimes referred to as a proper or plane binary tree) is a tree in which every node in the tree has either 0 or 2 children.
Now following Source 2,
becomes a binary tree but not according to Source 1 as the leaves are not in the same level.
So typically they consider trees like,
as Full Binary Tree.
I may sound stupid but I'm confused what to believe. Any help is appreciated. Thanks in advance.
There are three main concepts: (1) Full binary tree (2) Complete binary tree and (3) Perfect binary tree. As you said, full binary tree is a tree in which all nodes have either degree 2 or 0. However, a complete binary tree is one in which all levels except possibly the last level are filled from left to right. Finally, a perfect binary tree is a full binary tree in which all leaves are at the same depth. For more see the wikipedia page
My intuition for the term complete here is that given a fixed number of nodes, a complete binary tree is made by completing each level from left to right except possibly the last one, as the number of nodes may not always be of the form 2^n - 1.
I think the issue is, what's the purpose of making the definition? Usually, the reason for defining full binary tree in the way that appears in Wikipedia is to be able to introduce and prove the Full Binary Tree Theorem:
The total number of nodes N in a full binary tree with I internal nodes is 2 I + 1.
(There are several equivalent formulations of this theorem in terms of the number of interior nodes, number of leaf nodes, and total number of nodes.) The proof of this theorem does not require that all the leaf nodes be at the same level.
What one of your professors is describing is something I would call a perfect binary tree, or, equivalently, a full, complete binary tree.

Height difference between leaves in an AVL tree

What is the maximum difference between any two leaves in an AVL tree? If I take an example, my tree becomes unbalanced, if the height difference is more than 2(for any two leaves), but the answer is the difference can be any value. I really don't understand, how this is possible.Can anyone explain with examples?
The difference in levels of any two leaves can be any value! Definition of AVL describes height difference only on two sub-trees from one node.
So you need to fill subtrees with equal height then add new nodes just to create that single node difference. But nobody said that that subtree doesn't contain some subtrees with the exact same definition. Of course tree is selfbalanced but if we'll be that accurate to not touch it's balance then we can create any height difference between some leaves.
Example with leaf 24 on level 3 and leaf 10 on level 6:
According to the explanation in this Wikipedia article, the balancing operations in an AVL tree successfully aim at rearranging the tree such that the height of any two leaves differs no more than one. This is the key property of the data structure which makes the retrieval of nodes efficient (namely logarithmic in the number of nodes of the tree, as a path from the root to a leaf is traversed in the worst case).

Complete binary tree definitions

I have some questions on binary trees:
Wikipedia states that a binary tree is complete when "A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible." What does the last "as far left as possible" passage mean?
A well-formed binary tree is said to be "height-balanced" if (1) it is empty, or (2) its left and right children are height-balanced and the height of the left tree is within 1 of the height of the right tree, taken from How to determine if binary tree is balanced?, is this correct or there's "jitter" on the 1-value? I read on the answer I linked that there could be also a difference factor of 4 between the height of the right and the left tree
Do the complete and height-balanced definitions just apply to binary tree or just any other tree?
Following the reference of the definition in wikipedia, I got to
this page. The definition was taken from there but modified:
Definition: A binary tree in which every level, except possibly the deepest, is completely filled. At depth n, the height of the
tree, all nodes must be as far left as possible.
It continues with a note below though,
A complete binary tree has 2k nodes at every depth k < n and between 2n and 2^(n+1) - 1 nodes altogether.
Sometimes, definitions vary according to convenience (be useful for something). That passage might be a variation which, as I understand, requires leaf nodes to fill first the left side of the deepest level (that is, fill from left to right). The definition that I usually found is exactly as described above but without that
passage.
Usually the definition taken for height-balanced tree is the one you
described. In other words:
A tree is balanced if and only if for every node the heights of its two subtrees differ by at most 1.
That definition was taken from here. Again, sometimes definitions are made more flexible to serve specific purposes. For example, the definition of an AVL tree says that
In an AVL tree, the heights of the two child subtrees of any node
differ by at most one
Still, I remember once I had to rewrite an algorithm so that the tree
would be considered height-balanced if the two child subtrees of any
node differed by at most 2. Note that the definition you gave is recursive, this is very common for binary trees.
In a tree whose number of children is variable, you wouldn't be able to say that it is complete (any parent could have the number of children that you want). Still, it can apply to n-ary trees (with a fixed amount of n children).
Do the complete and height-balanced definitions just apply to binary
tree or just any other tree?
Short answer: Yes, it can be extended to any n-ary tree.

Why is avl tree faster for searching than red black tree?

I have read it in a couple of places that avl tree search faster, but not able to understand. As I understand :
max height of red-black tree = 2*log(N+1)
height of AVL tree = 1.44*logo(N+1)
Is it because AVL is shorter?
Yes.
The number of steps required to find an item depends on the distance between the item and the root.
Since the AVL tree is packed tighter (i.e. it has a lower max height) it means more items are closer to the root than in the red-black case.
The extra tight packing also means the AVL tree requires more work when inserting elements.
The best choice for any app depends on whether it is insert intensive or search intensive...
AVL tree is better than red-black tree if the input key is almost ascending/descending because then we would have to do single rotation(left-left or right-right case) to add this element. Also, since the tree would be tightly balanced, the search would also be faster.
But for randomly selected input key, RBTree are better since they require less rotation for insertion in comparison to AVL.
Overall, it depends on the input sequence, which would decide how tilted our tree is, and the operation performed.For insert-intensive use Red-Black Tree and for search-intensive use AVL.
AVL tree and RBTree do have respective advantages as well as disadvantages. You'll perceive that better if you've already learned how they work.
AVL is slightly faster than RBTree in insert operation because there would be at most one rotation involved in insertion, while there may be two for RBTree.
RBTree only require at most three rotations in deletion, but this is not guaranteed in AVL. So it can delete nodes faster than AVL.
However, above all, they both have strict logarithmic tree height.
Pick up any subtree, the property that makes AVL "balanced" guarantees that the difference of height between two child subtrees is at most one, which is to say, intuitively, the whole tree is rigidly balanced.
But when it comes to an RBTree, the rule becomes likely "looser", since property of RBTree can only guarantee the depth of a tree is not larger than twice as the logarithm of the total number of nodes.
Here're some facts that may be more precise:
An AVL tree's height is strictly less than: 1.44log(n+2)-0.328
(approximately)
A red-black tree's height is at most 2log(n+1)
See https://en.wikipedia.org/wiki/AVL_tree#Comparison_to_other_structures for detailed information.

Split 2-3 tree into less-than and greater-than given value X

I need to write function, which receives some key x and split 2-3 tree into 2 2-3 trees. In first tree there are all nodes which are bigger than x, and in second which are less. I need to make it with complexity O(logn). thanks in advance for any idea.
edited
I thought about finding key x in the tree. And after split its two sub-trees(bigger or lesser if they exist) into 2 trees, and after begin to go up and every time to check sub-trees which I've not checked yet and to join to one of the trees. My problem is that all leaves must be at the same level.
If you move from the root to your key and split each node so one points at the nodes larger than the key and the other at the rest and then make the larger node be a part of your larger tree, say by having the leftmost node at one level higher point at it, (don't fix the tree yet, do it at the end) until you reach the key you will get your trees. Then you just need to fix both trees on the path you used (note that the same path exists on both trees).
Assuming you have covered 2-3-4 trees in the lecture already, here is a hint: see whether you can apply the same insertion algorithm for 2-3 trees also. In particular, make insertions always start in the leaf, and then restructure the tree appropriately. When done, determine the complexity of the algorithm you got.

Resources