Construction of B+ trees - algorithm

Suppose I am asked to construct a B+ tree, of:
i) n = x.
ii) order = x.
iii) degree = x.
iv) p = x.
How many keys and pointers can each node contain in each of the above cases?

In a B+ tree, order denotes the maximum number of child pointers of an internal node: if the order of a B+ tree is m, then each internal node can have at most m children (and therefore at most m-1 keys) and, except for the root, at least CEIL(m/2) children.
For the degree of a B+ tree, the information I found is that if d is the degree of a B-tree, then each node can contain up to 2d items (keys). Both the B-tree and the B+ tree are multiway trees, so I suppose the definition of degree does not change. Check the link given in the comments as well, which indicates the same fact.
For n, as JustinDanielson mentioned, it might be the total number of keys stored in a node, in which case the number of child pointers would be n+1 (= x+1 in your question).
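As a minimal sketch of the "order" case only, assuming order m means the maximum number of children of an internal node as described above, the limits can be computed like this. The function name `bplus_internal_capacity` is made up for illustration, and textbooks differ on these definitions, so check which one your course uses.

```python
import math


def bplus_internal_capacity(order: int) -> dict:
    """Limits for a non-root internal node, assuming order = max children."""
    if order < 3:
        raise ValueError("order must be at least 3")
    min_children = math.ceil(order / 2)
    return {
        "max_children": order,
        "max_keys": order - 1,          # one key fewer than children
        "min_children": min_children,   # does not apply to the root
        "min_keys": min_children - 1,
    }
```

For example, `bplus_internal_capacity(5)` gives at most 5 children and 4 keys, and at least 3 children and 2 keys.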

Related

Count nodes bigger than root in each subtree of a given binary tree in O(n log n)

We are given a tree with n nodes in the form of a pointer to its root node, where each node contains a pointer to its parent, left child and right child, and also a key which is an integer. For each node v I want to add an additional field v.bigger which should contain the number of nodes with a key bigger than v.key that are in the subtree rooted at v. Adding such a field to all nodes of the tree should take O(n log n) time in total.
I'm looking for any hints that would allow me to solve this problem. I tried several heuristics. For example, when thinking about doing this bottom-up, for a fixed node v, v.left and v.right could provide v with some kind of set (a balanced BST?) with an operation bigger(x), which for a given x returns the number of elements bigger than x in that set in logarithmic time. The problem is that we would need to merge such sets in O(log n), so this seems like a no-go, as I don't know of any ordered-set-like data structure which supports quick merging.
I also thought about a top-down approach: a node v adds one to u.bigger for some node u if and only if u lies on the simple path from v to the root and u.key < v.key. So v could update all such u's somehow, but I couldn't come up with any reasonable way of doing that...
So, what is the right way of thinking about this problem?
Perform a depth-first search in the given tree (starting from the root node).
When a node v is visited for the first time (coming from its parent), add its key to some order-statistics data structure (OSDS). At the same time query the OSDS for the number of keys larger than the current key and initialize v.bigger with the negated result of this query.
When a node v is visited for the last time (coming from its right child), query the OSDS for the number of keys larger than the current key and add the result to v.bigger. The keys added between the two queries are exactly the keys in the subtree of v, so the difference of the two results is the number of larger keys in that subtree.
You could apply this algorithm to any rooted tree (not necessarily a binary tree). It also does not need parent pointers (you could use a DFS stack instead).
For the OSDS you could use either an augmented BST or a Fenwick tree. In the case of a Fenwick tree you need to preprocess the given tree so that the key values are compressed: just copy all the keys to an array, sort it, remove duplicates, then substitute the keys by their indexes in this array.
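The DFS described above can be sketched as follows, using a Fenwick tree over compressed keys. This is only an illustration under assumptions: the `Node` class, the `Fenwick` class and the function name `annotate_bigger` are made up here, parent pointers are replaced by an explicit stack, and the "last visit" is modelled by pushing each node a second time.

```python
from bisect import bisect_left


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.bigger = 0


class Fenwick:
    """Counts inserted items by compressed index (0-based from outside)."""

    def __init__(self, size):
        self.tree = [0] * (size + 1)

    def add(self, i):
        i += 1
        while i < len(self.tree):
            self.tree[i] += 1
            i += i & -i

    def prefix(self, i):
        """Number of inserted items with index <= i."""
        i += 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total


def annotate_bigger(root):
    if root is None:
        return
    # Key compression: collect all keys, sort them, drop duplicates.
    keys, stack = [], [root]
    while stack:
        node = stack.pop()
        keys.append(node.key)
        stack.extend(c for c in (node.left, node.right) if c is not None)
    ranks = sorted(set(keys))

    fen, inserted = Fenwick(len(ranks)), 0
    stack = [(root, False)]
    while stack:
        node, leaving = stack.pop()
        rank = bisect_left(ranks, node.key)
        if not leaving:
            # First visit: insert the key, store the negated "larger" count.
            fen.add(rank)
            inserted += 1
            node.bigger = -(inserted - fen.prefix(rank))
            stack.append((node, True))
            for child in (node.right, node.left):
                if child is not None:
                    stack.append((child, False))
        else:
            # Last visit: add the "larger" count seen after the subtree.
            node.bigger += inserted - fen.prefix(rank)
```

Each node causes a constant number of Fenwick operations, so the total cost is O(n log n) regardless of the shape of the tree.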
Basic idea:
Using the bottom-up approach, each node will get two ordered lists of the values in the subtree from both sons and then find how many of them are bigger. When finished, pass the combined ordered list upwards.
Details:
Leaves:
Leaves obviously have v.bigger=0 and pass up a one-item list. The node above them merges the lists of its leaf children, updates its own v.bigger and adds its own value to the list.
All other nodes:
Get both lists from the sons and merge them in an ordered way. Since they are already sorted, this is O(number of nodes in the subtree). During the merge you can also count how many nodes satisfy the condition and so get the value of v.bigger for the node.
Why is this O(n log n)?
Every node does work proportional to the number of nodes in its subtree. The root goes through all the nodes in the tree, the sons of the root together go through the number of nodes in the tree (yes, yes, -1 for the root), and so on: all nodes at the same depth together go through at most the number of nodes below them. The total number of nodes counted is therefore at most (number of nodes) * (height of the tree), which is O(n log n) when the tree is balanced, i.e. when its height is O(log n).
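A minimal sketch of this bottom-up merge, assuming a simple `Node` class and a made-up function name `annotate_by_merging`. It uses `heapq.merge` for the linear merge and a binary search to count the larger keys; it is recursive, so very deep trees would need an iterative version, and its cost is O(n * height) as discussed above.

```python
import heapq
from bisect import bisect_right


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.bigger = 0


def annotate_by_merging(node):
    """Set node.bigger for the whole subtree; return its keys in sorted order."""
    if node is None:
        return []
    left_keys = annotate_by_merging(node.left)
    right_keys = annotate_by_merging(node.right)
    # Both lists are already sorted, so the merge is linear in their size.
    merged = list(heapq.merge(left_keys, right_keys))
    # Everything after this position is strictly larger than node.key.
    pos = bisect_right(merged, node.key)
    node.bigger = len(merged) - pos
    merged.insert(pos, node.key)  # pass the combined sorted list upwards
    return merged
```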
What if for each node we keep a separate binary search tree (BST) which consists of the nodes of the subtree rooted at that node?
Assume the given tree is balanced. For a node v at level k, merging the BSTs of v.left and v.right, which both have O(n/2^(k+1)) elements, is O(n/2^k). After forming the BST for this node, we can find v.bigger in O(n/2^(k+1)) time by counting the elements larger than v.key (traditionally those in the right part of the BST). Summing up, we have O(3*n/2^(k+1)) operations for a single node at level k. There are a total of 2^k nodes at level k, therefore we have O(2^k*3*n/2^(k+1)) operations at level k, which simplifies to O(n) (dropping the 3/2 constant). There are log(n) levels, hence we have O(n*log(n)) operations in total.

Minimum and Maximum number of nodes in a 2-3 Tree

I'm trying to find out what are the minimum and maximum number of nodes in a 2-3 Tree with n leaves.
I have tried bounding it with inf/sup, but I couldn't get further than the fact that the number of nodes in a 2-3 tree is bigger than the number of nodes in a full AVL tree.
Thanks in advance
Operating under the definition of a 2-3 tree at wikipedia:
In computer science, a 2–3 tree is a type of data structure, a tree where every node with children (internal node) has either two children (2-node) and one data element or three children (3-nodes) and two data elements. Nodes on the outside of the tree (leaf nodes) have no children and one or two data elements.
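With the number of leaves n fixed and all leaves at the same depth, the number of nodes is largest when every internal node has 2 children and smallest when every internal node has 3 children, because fewer children per node means more internal nodes are needed above the same n leaves.
If every internal node has 2 children, the height of the tree is log2(n) (log base 2 of n) and the tree has 2n - 1 nodes in total: n leaves plus n - 1 internal nodes. This is the maximum.
If every internal node has 3 children, the height of the tree is log3(n) (log base 3 of n) and the tree has (3n - 1)/2 nodes in total: n leaves plus (n - 1)/2 internal nodes. This is the minimum. These exact values are reached only when n is a power of 2 (maximum) or a power of 3 (minimum); for other n they are bounds.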
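The bounds can also be checked with a short counting argument: every node except the root is a child of exactly one internal node, so with I internal nodes the number of child links is n + I - 1, and it must lie between 2I and 3I. That gives (n - 1)/2 <= I <= n - 1. The sketch below just evaluates these bounds; the function name `two_three_node_bounds` is made up for illustration, and the values are bounds that need not be attained for every n.

```python
def two_three_node_bounds(n_leaves: int) -> tuple:
    """Lower and upper bound on the total node count of a 2-3 tree with n leaves."""
    if n_leaves < 1:
        raise ValueError("need at least one leaf")
    if n_leaves == 1:
        return (1, 1)  # a single node that is both root and leaf
    min_internal = -(-(n_leaves - 1) // 2)  # ceil((n - 1) / 2): all 3-nodes
    max_internal = n_leaves - 1             # all 2-nodes
    return (n_leaves + min_internal, n_leaves + max_internal)
```

For 9 leaves the lower bound 13 is reached by the tree with 1 + 3 + 9 nodes, and for 8 leaves the upper bound 15 is reached by the tree with 1 + 2 + 4 + 8 nodes.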

Given a number n, how many balanced binary trees (not binary search trees) are there?

The definition of balanced in this question is
The number of nodes in its left subtree and the number of nodes in its
right subtree are almost equal, which means their difference is not
greater than one
Given n as the total number of nodes, how many such trees are there?
Also what if we replace the number of nodes with height? Given a height, how many height balanced trees are there?
The difference is made only by the last level, so you can work out how many nodes are left for that level and consider all possible placements. With n nodes the height must be floor(log(n)) (log base 2), so the tree down to depth k = floor(log(n)) - 1 is completely full and needs m = sum(i=0..k) 2^i nodes, which leaves n-m nodes for the last level. Some definitions of a balanced binary tree force all the nodes of the last level to be left-aligned; in this case it is obvious that there is only one possibility. Without this constraint there are C(2^floor(log(n)), n-m) combinations, because you have to pick which of the 2^floor(log(n)) possible slots on the last level receive the n-m remaining nodes.
For the height version, sum the combinations C(2^h, i) as i goes from 1 to 2^h, where h is the given height (the floor(log(n)) above). You consider all possibilities of having 1 node on the last level, then 2, and so on, up to the full binary tree in which all 2^h slots are assigned.
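If the balance condition from the question is read strictly, that is, applied at every node and not only at the root, the count can be computed directly with a recurrence on the subtree sizes. The sketch below assumes that reading; the function name `count_size_balanced` is made up for illustration. Under this reading the result can be smaller than the last-level count above: for n = 5 the two extra nodes must go to different halves, giving 4 trees rather than C(4, 2) = 6.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_size_balanced(n: int) -> int:
    """Number of binary tree shapes with n nodes where, at every node,
    the left and right subtree sizes differ by at most one."""
    if n <= 1:
        return 1
    rest = n - 1                      # nodes left after removing the root
    small, big = rest // 2, rest - rest // 2
    if small == big:
        return count_size_balanced(small) ** 2
    # Unequal halves: the bigger half can be on the left or on the right.
    return 2 * count_size_balanced(small) * count_size_balanced(big)
```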

Number of trees that can be constructed from a given Inorder/Preorder/Postorder traversal

I know one cannot construct a unique tree without having both the Inorder and the Preorder/Postorder traversal, because a single traversal (only Inorder, Preorder or Postorder) can correspond to more than one tree. Is there an algorithm or mechanism to compute the number of unique trees for a given single traversal (only Inorder, Preorder or Postorder)?
Eg : a b c d e f g is my Inorder traversal.
How many unique trees can be constructed with the given Inorder traversal?
I searched for this on Google but none of the explanations are clear.
Any help would be appreciated...
Well the algorithm is as follows:
Let, P(N) denote the number of trees possible with N nodes. Let the indexes of the nodes be 1,2,3,...
Now, lets pick the root of the tree. Any of the given N nodes can be the root. Say node i has been picked as root. Then, all the elements to the left of i in the inorder sequence must be in the left sub-tree. Similarly, to the right.
So, total possibilities are: P(i-1)*P(N-i)
In the above expression i varies from 1 to N.
Hence we have,
P(N) = P(0)*P(N-1) + P(1)*P(N-2) + P(2)*P(N-3) + ... + P(N-1)*P(0)
The base cases will be:
P(0) = 1
P(1) = 1
Thus this can be solved by using Dynamic Programming.
Note that a particular traversal is just a way of labeling the nodes in a tree, so that the number of possible binary trees is the same for any two traversals of the same length. The number of binary trees with n nodes is given by the n-th Catalan number (with the 0-th Catalan number equal to 1).
The formula
(2n)! / (n! * (n+1)!)
OR
C(2n, n) / (n+1)
where C(2n, n) is the binomial coefficient, gives the number of possible binary trees for any given INORDER/PREORDER/POSTORDER traversal of n nodes.
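As a small sketch, the recurrence for P(N) and the closed formula can be checked against each other. The function names `count_trees` and `catalan` are made up for illustration.

```python
from math import comb


def count_trees(n: int) -> int:
    """P(n) via dynamic programming: choose the root i, multiply left and right counts."""
    p = [0] * (n + 1)
    p[0] = 1  # the empty tree
    for size in range(1, n + 1):
        p[size] = sum(p[i - 1] * p[size - i] for i in range(1, size + 1))
    return p[n]


def catalan(n: int) -> int:
    """Closed form C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

For the seven-node inorder sequence a b c d e f g in the question, both functions give 429.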

Relationship between number of nodes and height

I am reading The Algorithm Design Manual. The author states that the height of a tree is:
h = log n,
where
h is height
n = number of leaf nodes
log is log to base d, where d is the maximum number of children allowed per node.
He then goes on to say that the height of a perfectly balanced binary search tree, would be:
h = log n
I wonder if n in this second statement denotes 'total number of leaf nodes' or 'total number of nodes'.
Which brings up a bigger question, is there a mathematical relationship between total number of nodes and the height of a perfectly balanced binary search tree?
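Sure: roughly n = 2^h. Precisely, a perfectly balanced binary tree of height h has 2^h leaf nodes and 2^(h+1) - 1 nodes in total, so the relationship holds up to a constant factor whichever of the two n denotes.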
proof sketch:
a perfectly balanced binary tree has
an actual branching factor of 2 at each inner node.
equal root path lengths for each leaf node.
about the leaf nodes in a perfectly balanced binary tree:
as the number of leaves is the number of nodes minus the number of nodes in a perfectly balanced binary tree with a height decremented by one, the number of leaves is half the number of all nodes (to be precise, half of n+1, where n is the number of all nodes).
so h just varies by 1 depending on which count you use, which usually doesn't make any real difference in complexity considerations. that claim can be illustrated by remembering that it amounts to the same variation as defining the height of a single node tree as either 0 (standard) or 1 (unusual, but maybe handy in distinguishing it from an empty tree).
It doesn't really matter if you talk of all nodes or just leaf nodes: either is bounded above and below by the other multiplied by a constant factor. In a perfectly balanced binary tree the number of nodes on a full level is the number of all nodes in the levels above plus one.
In a binary tree in which every level is completely filled, the number of nodes (n) and the height of the tree (h) are related as follows:
n = 2^(h+1) - 1
where n counts all the nodes of the tree.
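A minimal sketch of these relationships, assuming the height of a single-node tree is 0 and every level is completely filled. The function names are made up for illustration.

```python
def perfect_tree_counts(height: int) -> tuple:
    """Return (leaf count, total node count) for a completely filled binary tree."""
    leaves = 2 ** height
    total = 2 ** (height + 1) - 1
    return leaves, total


def height_from_total(n: int) -> int:
    """Invert n = 2^(h+1) - 1; n must be one less than a power of two."""
    if n < 1 or (n + 1) & n != 0:
        raise ValueError("n is not the size of a completely filled binary tree")
    return (n + 1).bit_length() - 2
```

For example, a tree of height 3 has 8 leaves and 15 nodes, and `height_from_total(15)` returns 3.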
