I read in Introduction to Algorithms that it is not clear what the expected height of a randomized binary search tree is. Here "randomized binary search tree" means a tree constructed by randomly inserting and deleting nodes. May I ask what the essential difficulty of this problem is? The original wording is:
"
Unfortunately, little is known about the average height of a binary search tree
when both insertion and deletion are used to create it." (Chapter 12)
All answers are welcome (my CS level is really low)
The images below show a union-find problem solved using union by rank with path compression. If you can't read my handwriting, the description below explains what I have done.
Description:
First I took the union of 1 and 2, then Union(3,4), then Union(5,6), and so on, comparing ranks and applying path compression whenever finding the representative element of a tree to be merged.
My Doubt:
My doubt is: if you look at the final tree in the image, you'll see that it is completely flat (by flat I mean its depth). Will path compression always result in a flat tree, no matter how many elements are present?
And also, how can we find union-find's time complexity with path compression?
It is possible to build inverse trees of unlimited depth. For example, if you always happen to pass the roots as your Union() arguments, then no paths are ever compressed, and you can build your tree as tall as you like.
If (as your written notes suggest) you use rank to choose the root resulting from your union, then your trees will be balanced, so you will need Ω(2^n) operations to generate a tree of depth n. (Specifically: to build a tree of depth n, first build two trees of depth n-1 and take the Union() of their roots.)
The amortized time complexity of union-find with union by rank and path compression is known to be O(α(n)), where α is the inverse Ackermann function.
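Both optimizations are only a few lines each. Here is a minimal Python sketch (class and method names are my own, not from the question) showing union by rank and path compression together:

```python
class DisjointSet:
    """Union-find with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression: point every node on the path directly at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        # Union by rank: attach the shallower root under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1


# Mimic the question: Union(1,2), Union(3,4), ... then merge the groups.
ds = DisjointSet(8)
for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    ds.union(a, b)
print(ds.find(7) == ds.find(0))  # True: all eight elements share one root
```

After the final `find` calls, every node points directly at the root, which is exactly the "flat tree" observed in the question.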
In this paper by Demaine et al., a new data structure is proposed in which a tree of B-trees is implemented to achieve dynamic optimality. Throughout the paper, it is assumed that B = (log N)^O(1).
As in Tango trees, preferred paths are created using the previously accessed nodes. These preferred paths are stored as auxiliary B-trees, and a tree of these auxiliary trees is built. This dynamic tree is called the Belga B-tree.
When serving an access sequence, we visit nodes of a B-tree, and the higher the branching factor, the fewer nodes need to be accessed to find a key. But the authors have put a limit of B = (log N)^O(1).
They have also mentioned that when B = (log N)^O(1), then
1 + log_B(log N) = O(log_B(log N))
Why is this condition necessary for the algorithm to work?
I fail to understand the significance of this. We know that the larger the branching factor, the smaller the height of the tree and hence the shorter the access time. Why is there a restriction on the value of B?
Even when the algorithm is explained, nowhere have they used the fact that B is polynomial in logN.
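To get a feel for the numbers involved, here is a rough back-of-the-envelope check of my own (not from the paper). A B-tree on K keys has height about log_B K, so an auxiliary tree over a preferred path of length O(log N) has height about log_B(log N); once B reaches polylog(N), that height is already a small constant and growing B further gains almost nothing:

```python
import math


def btree_height(keys, branching):
    """Approximate B-tree height: ceil(log base `branching` of `keys`)."""
    return max(1, math.ceil(math.log(keys, branching)))


N = 2 ** 20                       # keys in the whole tree
path_len = int(math.log2(N))      # a preferred path has O(log N) nodes, here 20
for B in (2, path_len, path_len ** 2):
    print(B, btree_height(path_len, B))
```

With N = 2^20 a preferred path holds about 20 keys, and the auxiliary tree's height drops from 5 (B = 2) to 1 as soon as B is around log N, illustrating why only the polylog regime of B matters for the bound.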
I have read many definitions of a "heap" online, and I have also read the definition in CLRS. Most of the definitions online seem to say that heaps are complete binary trees; however, CLRS starts the heap chapter with the following sentence:
The (binary) heap data structure is an array object that we can view
as a nearly complete binary tree...
I'm not sure why, but it really bothers me that CLRS calls heaps "nearly complete," whereas almost every other definition of "heap" I've read calls heaps "complete."
This leads me to the following question: Is it possible to have a heap that isn't a complete binary tree?
You are absolutely right to be bothered by the expression "nearly complete". A heap is a complete binary tree, according to the most common terminology:
complete binary tree: all levels except the last are fully occupied, and the leaves in the last level appear at the left side of that level.
perfect binary tree: a complete binary tree where also the last level is completely occupied.
full binary tree: a binary tree where none of the nodes has just one child. Sometimes this term is used to denote a perfect binary tree, adding to the confusion.
A perfect binary tree is also a complete and a full binary tree, but a complete binary tree may or may not be a full binary tree.
But the Wikipedia article on Binary tree warns:
Some authors use the term complete to refer instead to a perfect binary tree [...] in which case they call this type of tree (with a possibly not filled last level) an almost complete binary tree or nearly complete binary tree.
So apparently the author of the text you refer to falls into that category.
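The three definitions above are easy to check mechanically. Here is a small Python sketch of my own that classifies a binary tree under the most common terminology:

```python
from collections import deque


class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right


def is_complete(root):
    """Complete: levels full except possibly the last, filled left to right."""
    queue, seen_gap = deque([root]), False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:          # a real node after a gap breaks completeness
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True


def is_full(root):
    """Full: every node has either zero or two children."""
    if root is None:
        return True
    if (root.left is None) != (root.right is None):
        return False
    return is_full(root.left) and is_full(root.right)


def is_perfect(root):
    """Perfect: every level, including the last, is entirely filled."""
    def height(n):
        return 0 if n is None else 1 + max(height(n.left), height(n.right))

    def count(n):
        return 0 if n is None else 1 + count(n.left) + count(n.right)

    return count(root) == 2 ** height(root) - 1


# Five nodes, last level half filled: complete and full, but not perfect.
t = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(is_complete(t), is_full(t), is_perfect(t))  # True True False
```

The example tree shows the distinction the answer draws: a complete tree need not be perfect, and (here by coincidence) a complete tree can also happen to be full.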
What exactly does complete mean? Opinions differ. In the context of heaps, a complete binary tree is one whose last level contains the maximum number of nodes.
A heap that does not have the maximum number of leaves in its last level is not complete; it is nearly complete.
For example, a heap with 7 elements forms a complete binary tree, but a heap with 4, 5, or 6 elements would not have its last level completely filled, i.e. it is only nearly complete.
A heap forming a nearly complete binary tree of depth three (taking the depth of the root node to be 1) looks like this:
      9
    /   \
   7     8
  / \
 3   5
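The CLRS "array object that we can view as a nearly complete binary tree" can be made concrete in a few lines of Python (a sketch using the usual 0-based index arithmetic): an array of any length is automatically a nearly complete tree, which is why a heap never needs explicit child pointers.

```python
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2


def is_max_heap(a):
    """Check the max-heap property on the implicit nearly complete tree."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))


# 5 elements: the last level is only half full, yet the heap is valid.
print(is_max_heap([9, 7, 8, 3, 5]))  # True
print(is_max_heap([1, 2, 3]))        # False: children exceed the root
```

Nothing in the index arithmetic requires the last level to be full, only that the occupied slots are the leftmost ones, which the array layout guarantees by construction.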
Lots of tutorials focus on the implementation of binary search trees, which make search operations easy. Are there applications or circumstances where implementing a plain binary tree is better than a BST? Or is it just taught as an introductory concept for trees?
You use a binary tree (rather than a binary search tree) when you have a structure that requires a parent and up to two children. For example, consider a tree to represent mathematical expressions. The expression (a+b)*c becomes:
*
/ \
+ c
/ \
a b
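Such an expression tree is evaluated by a simple post-order walk: evaluate both children, then apply the operator at the node. A minimal Python sketch (class and function names are my own):

```python
class Expr:
    """A node of a binary expression tree: an operator, or a leaf value."""
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right


def evaluate(node):
    if node.left is None and node.right is None:
        return node.val                       # leaf: a plain number
    lhs, rhs = evaluate(node.left), evaluate(node.right)
    return {"+": lhs + rhs, "-": lhs - rhs,
            "*": lhs * rhs, "/": lhs / rhs}[node.val]


# (a + b) * c with a=1, b=2, c=3
tree = Expr("*", Expr("+", Expr(1), Expr(2)), Expr(3))
print(evaluate(tree))  # 9
```

Note that no search-tree ordering exists or is wanted here: the shape of the tree encodes operator precedence, not key order.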
The pairing heap is a data structure that is logically a general tree (i.e. there is no restriction on the number of children a node can have), but it is often implemented using a left-child right-sibling (LCRS) binary tree. The LCRS binary tree is often more efficient and easier to work with than a general tree.
The binary heap also is a binary tree, but not a binary search tree.
The old guessing game, where the player answers a series of yes/no questions in order to arrive at an answer, is another example of a binary tree. In the tree below, the left child is the "No" answer and the right child is the "Yes" answer:
Is it an animal?
/ \
Is it a plant? Is it a mammal?
/ \
A reptile? A dog?
You can imagine an arbitrarily deep tree with questions at each level.
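The guessing game maps directly onto a binary tree traversal; here is a small Python sketch of my own, with internal nodes holding questions and traversal driven by a list of yes/no answers:

```python
class Question:
    """Internal node: a yes/no question; a childless node is the final guess."""
    def __init__(self, text, no=None, yes=None):
        self.text, self.no, self.yes = text, no, yes


def play(node, answers):
    """Walk the tree using a list of 'y'/'n' answers; return the reached leaf."""
    answers = list(answers)
    while node.no is not None or node.yes is not None:
        node = node.yes if answers.pop(0) == "y" else node.no
    return node.text


game = Question("Is it an animal?",
                no=Question("Is it a plant?"),
                yes=Question("Is it a mammal?",
                             no=Question("A reptile?"),
                             yes=Question("A dog?")))
print(play(game, ["y", "n"]))  # A reptile?
```

Again there is no ordering between keys, so a binary search tree would be the wrong tool: the left/right distinction carries meaning ("No"/"Yes"), not relative order.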
Those are just a few examples. I've found binary trees useful in lots of different situations.
What do we mean by the length of a binary tree: the number of nodes, or the height of the tree?
Thank you
It is not a term I have seen used to describe the properties of a binary tree. I would guess someone using it would be referring to the depth.
I would personally think of 'length' as the height (depth), not the size (# of nodes) of the tree, but this is quite a contextual question.
Typically, 'length' refers to the number of items in the underlying data structure.
The height of the tree would be its 'depth'.
I am going to argue that n, the number of nodes, is the "best" answer.
Almost any recursively consistent measure could be argued as a potential answer, e.g. height. However, the size of the tree, n, is the largest of the candidate numbers.
The height of the tree is log n for a balanced tree (and at most n in general), and the other candidates will all be the same or smaller. So I conclude that the node count "should" be the length of a tree: it carries the most bits of information of the arguable possibilities.
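The two competing measures are each a two-line recursion; a Python sketch of my own makes the distinction concrete:

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right


def size(t):
    """Number of nodes in the tree."""
    return 0 if t is None else 1 + size(t.left) + size(t.right)


def height(t):
    """Number of levels; for a balanced tree this is about log2(size + 1)."""
    return 0 if t is None else 1 + max(height(t.left), height(t.right))


# A perfectly balanced tree with 7 nodes has height 3 = log2(7 + 1).
t = Node(Node(Node(), Node()), Node(Node(), Node()))
print(size(t), height(t))  # 7 3
```

The example shows the gap the answer describes: the same tree reports 7 under one measure and 3 under the other, so which one "length" means is purely a matter of convention.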