Number of Nodes with Specific Black-Height in Red-Red-Black trees

I was asked in a homework assignment to answer a question regarding "Red-Red-Black" trees. The description of a red-red-black tree (copied from somewhere in the internet) is:
"A red-red-black tree is a binary search tree that satisfies the following conditions:
Every node is either red or black
Every leaf (nil) is black
If a node is red and its parent is red, then both its children are black
Every simple path from a node to a descendant leaf contains the same number of black nodes (the black-height of the tree)"
I was asked, given a red-red-black tree with n nodes, what is the largest number of internal nodes with black-height k? What's the smallest number?
I've been thinking about it for more than two hours now, but apart from a headache I haven't gotten anywhere.
Thanks!

Maximum number of nodes: 2^(2k) - 1
Minimum number of nodes: 2^k - 1

Two red nodes can never appear consecutively.
The number of black nodes must be equal along every path you traverse.

Related

Properties of Red-Black Tree

Properties of Red-Black Tree:
Every node is either red or black.
The root is black.
Every leaf (NIL) is black.
If a node is red, then both its children are black.
For each node, all simple paths from the node to descendant leaves contain the same number of black nodes.
According to the properties, are these valid or invalid red black trees?
A.
I think this is valid
B.
I think this is valid, but I am not sure, since there are two adjacent red nodes?
C.
I think this is valid, but I am not sure, since there are two adjacent red nodes?
D.
I think this is not valid, since it violates Property 4?
Did I understand these properties of a red-black tree correctly? If not, where am I wrong?
You have listed the properties of Red-Black trees correctly. Of the four trees only C is not a valid red-black tree:
A.
This is a valid tree. Wikipedia confirms:
every perfect binary tree that consists only of black nodes is a red–black tree.
B.
I think this is valid, but I am not sure, since there are two adjacent red nodes?
It is valid. There is no problem with red nodes being siblings. They just should not be in a parent-child relationship.
C.
I think this is valid, but I am not sure, since there are two adjacent red nodes?
It is not valid. Not because of the adjacent red nodes, but because of property 5. The node with label 12 has paths to its leaves with varying number of black nodes. Same for the node 25.
As a general rule, a red node can never have exactly one NIL-leaf as child. Its children should either both be NIL-leaves, or both be (black) internal nodes. This follows from the properties.
D.
I think this is not valid, since it violates Property 4?
Property 4 is not violated: the children of the red nodes are NIL leaves (not visualised here), which are black. The fact that these red nodes have black NIL leaves as siblings is irrelevant: there are no rules that concern siblings. So this is valid.
For an example that combines characteristics of tree C and D, see this valid tree depicted in the Wikipedia article, which also depicts the NIL leaves:
A, B and D are valid red-black trees.
C is not a valid red-black tree, because the black height from root to leaf is not the same on every path. It is 2 on some paths and 1 on others. That violates what you stated as rule 5.
If 12 had a black right child and 25 a black left child, then it would be a red-black tree.
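The checks used in these answers (property 4 on red parents, property 5 on black counts) can be sketched as a small validator. This is only an illustration: the `Node` class, the function names, and the three sample trees are hypothetical stand-ins, since the original pictures are not available here. `None` plays the role of a black NIL leaf.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    color: str                      # "R" or "B"
    left: Optional["Node"] = None   # None stands for a black NIL leaf
    right: Optional["Node"] = None


def black_count(node):
    """Number of black nodes on each path from `node` down to a NIL leaf
    (counting `node` and the NIL leaf), or None if property 4 or 5 fails."""
    if node is None:
        return 1  # a NIL leaf is black (property 3)
    lh = black_count(node.left)
    rh = black_count(node.right)
    if lh is None or rh is None or lh != rh:
        return None  # property 5 violated somewhere below
    if node.color == "R":
        for child in (node.left, node.right):
            if child is not None and child.color == "R":
                return None  # property 4: a red node needs black children
        return lh
    return lh + 1


def is_red_black(root):
    if root is not None and root.color != "B":
        return False  # property 2: the root is black
    return black_count(root) is not None


# Hypothetical trees in the spirit of B, C and D (not the original pictures).
red_siblings = Node(10, "B", Node(5, "R"), Node(15, "R"))
one_nil_child = Node(20, "B", Node(12, "R", Node(8, "B"), None), Node(25, "B"))
red_next_to_nil = Node(20, "B", Node(12, "R"), None)

print(is_red_black(red_siblings))     # red siblings are fine
print(is_red_black(one_nil_child))    # red node with exactly one NIL child
print(is_red_black(red_next_to_nil))  # red node whose sibling is a NIL leaf
```

The second tree fails for the same reason given for C: the red node 12 has one path through a black child and one path straight to a NIL leaf.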
A red-black tree is basically identical to a 2-3-4 tree (4-B-tree), even though the splitting/swapping method is upside down.
2-3-4 trees have fixed-size 3-node buckets. The color black means that it's the central node of the 3-bucket. Any red-black tree can be considered a perfect quadtree/binary tree (of 3-node buckets) with empty nodes (black holes and red holes).
In other words, every black node (every 3-bucket) has its absolute position in the perfect tree (a 2-dimensional unique Cartesian or 4-adic/2-adic unique fraction number).
NIL nodes are just extra flags to save space; you don't have enough memory to store a perfect quadtree/binary tree.
The easiest way to check a red-black tree is to check that each black node is a new bucket (going down) and each red node is grouped with the black node above it (same bucket). If the central black node has fewer than 2 red nodes, you can just add empty red holes next to the central black node (left and right).
A new black node is always the grandson of the last black node, and each black node can have only two red daughter-nodes and no black son-nodes. If the red daughter (mother) is empty (dead/unborn), the motherless grandson-node is directly linked to its grandfather-node.
A motherless black grandson-node has no brother, but he can have a black cousin-node next to him; the 2 cousins are linked to the same grandfather.
A quadtree is a subset of a binary tree.
All black nodes have even heights (2, 4, 6...), and all red nodes have odd heights (1, 3, 5...). Optionally, you can use the half unit 0.5.
The 3-bucket has a fixed size of 3; just add extra red holes (unborn unlinked red daughters) to make the size 3.

Why are leaves blank on all rb trees?

I'm studying data structures, and there is something I do not understand about red-black trees. The following properties are always listed for these trees, but in every example the leaves hold null values. Why is that not among the properties as well? Why not "All leaf nodes are black and blank"?
Red/Black Property: Every node is colored, either red or black.
Root Property: The root is black.
Leaf Property: Every leaf (NIL) is black.
Red Property: If a red node has children, then the children are always black.
Depth Property: For each node, any simple path from this node to any of its descendant leaf has the same black-depth (the number of black nodes).
It doesn't really matter if all nodes are keyless and black or not.
If the nodes could be any color and/or empty, the asymptotics would not change at all, since red children cannot have red parents and all internal nodes have associated keys.
All paths would still have lengths that are at most a factor of two different in total number of keys, though now that would be 2L+2, rather than 2L for the longest path compared to the shortest path (of length L).

Largest and smallest number of internal nodes in red-black tree?

The smallest number of internal nodes in a red-black tree with black height of k is 2^k - 1, which is one in the following image:
The largest number of internal nodes with black height of k is 2^(2k) - 1, which, if the black height is 2, should be 2^4 - 1 = 15. However, consider this image:
The number of internal nodes is 7. What am I doing wrong?
(I've completely rewritten this answer because, as the commenters noted, it was initially incorrect.)
I think it might help to think about this problem by using the isometry between red-black trees and 2-3-4 trees. Specifically, a red-black tree with black height h corresponds to a 2-3-4 tree with height h, where each red node corresponds to a key in a multi-key node.
This connection makes it easier for us to make a few neat observations. First, any 2-3-4 tree node in the bottom layer corresponds to a black node with either no red children, one red child, or two red children. These are the only nodes that can be leaf nodes in the red-black tree. If we wanted to maximize the number of total nodes in the tree, we'd want to make the 2-3-4 tree have nothing but 4-nodes, which (under the isometry) maps to a red/black tree where every black node has two red children. An interesting effect of this is that it makes the tree layer colors alternate between black and red, with the top layer (containing the root) being black.
Essentially, this boils down to counting the number of internal nodes in a complete binary tree of height 2h - 1 (2h layers alternating between black and red). This is equal to the number of nodes in a complete binary tree of height 2h - 2 (since if you pull off all the leaves, you're left with a complete tree of height one less than what you started with). This works out to 2^(2h - 1) - 1, which differs from the number that you were given (which I'm now convinced is incorrect) but matches the number that you're getting.
You need to count the black NIL leaves in the tree; otherwise this formula won't work. The root must not be red, as that would violate one of the properties of a red-black tree.
The problem is you misunderstood the black height.
The black height of a node in a red-black tree is the number of black nodes from the current node to a leaf, not counting the current node. (This will be the same value along every route.)
So if you just add two black leaves to every red node, you will get a red-black tree with a black height of 2 and 15 internal nodes.
(Also, in a red-black tree every red node has two black children, so red nodes can't be leaves.)
After reading the discussion above: if I add the root with the red attribute, the second node I add will be red again, which would be a red violation, and after node restructuring I assume that we again reach a black root with a red child, with which we might not get the maximum of 2^(2k) - 1 internal nodes.
Am I missing something here? I started working on red-black trees just recently.
It seems you haven't considered the "black leaves" (black nodes): the 2 NIL nodes for each of the red nodes on the last level. If you consider the NIL nodes as leaves, the red nodes on the last level now get counted as internal nodes, for a total of 15.
The tree given here actually has 15 internal nodes. The NIL black children of the red nodes in the last layer are missing; these are actually called external nodes (nodes without a key). The tree has a black-height of 2. The actual expression for the maximum number of internal nodes for a tree with black-height k is 4^k - 1. In this case, it turns out to be 15.
In red-black trees, external nodes (null nodes) are always black, but in your question the second tree does not show the external nodes, and hence you are getting a count of 7. If you draw the external nodes (null nodes) and then count the internal nodes, you can see that it turns out to be 15.
Not sure that I understand the question.
For any binary tree where all layers (except maybe the last one) have the maximum number of items, we will have 2^(k-1)-1 internal nodes, where k is the number of layers. In the second picture you have 4 layers, so the number of internal nodes is 2^(4-1)-1=7
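The two bounds can be checked with a short recursion. The sketch below assumes the convention from the answers above: black height does not count the node itself, NIL leaves are black with black height 0, and the root is black. The function names are made up for this illustration.

```python
def min_internal(k):
    """Fewest internal nodes under a black root of black height k:
    use no red nodes at all, so every child is black."""
    if k == 0:
        return 0  # a NIL leaf contributes no internal nodes
    return 1 + 2 * min_internal(k - 1)


def max_internal(k):
    """Most internal nodes under a black root of black height k:
    give every black node two red children."""
    if k == 0:
        return 0
    # Each red child has two black children of black height k - 1.
    red_child = 1 + 2 * max_internal(k - 1)
    return 1 + 2 * red_child


for k in range(1, 6):
    assert min_internal(k) == 2**k - 1
    assert max_internal(k) == 4**k - 1

print(min_internal(2), max_internal(2))  # 3 15
```

For k = 2 the maximum tree has layers black, red, black, red (1 + 2 + 4 + 8 nodes), which is the 15 mentioned above once the red nodes on the last level are counted as internal.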

Inserting into Augmented Red Black Tree

I am looking at the code of inserting into an augmented Red black tree. This tree has an additional field called "size" and it keeps the size of a subtree rooted at a node x. Here is the pseudocode for inserting a new node:
AugmentedRBT_Insert(T,x){
    BST_Insert(T,x);            //insert as if it is a normal BST
    x[color]=red;               //insert as a red node
    size[x]=1;
    tmp=parent[x];
    while(tmp!=NULL){           //start from the node x and follow the path to root
        size[tmp]=size[tmp]+1;  //update the size of each node
        tmp=parent[tmp];
    }
}
Forget about fixing the coloring and rotations; they will be done in another function. My question is, why do we set the size of the newly added node "x" to 1? I understand that it will not have any subtrees, so its size must be 1, but one of the requirements of an RBT is that every red node has two black children. In fact every leaf node is NULL, and even if we insert the node "x" as black, it should still have 2 black NULL nodes, so I think we must set its size to 3. Am I wrong?
Thanks.
An insertion in a red-black tree, as in most binary trees, happens directly at a leaf. Hence the size of the subtree rooted at the leaf is 1. The red node does have two black children, because leaves always have the "root" or "nil" as a child, which is black. Those null elements aren't nodes, so we wouldn't count them.
Then, we go and adjust the sizes of all parents up to the root (they each get +1 for the node we just added).
Finally, we fix these values when we rotate the tree to balance it, if necessary. In your implementation, you will probably want to do both the size updates and rotations in one pass instead of two.
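A minimal Python sketch of the insertion step described above, without recoloring or rotations. The `Node` and `Tree` classes and the function name are assumptions made for this illustration; `None` stands in for the NIL children, which are not counted in `size`.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.color = "red"
        self.size = 1          # the node itself; NIL children add nothing
        self.left = None
        self.right = None
        self.parent = None


class Tree:
    def __init__(self):
        self.root = None


def augmented_insert(tree, x):
    # Ordinary BST insertion: walk down to the NIL position for x.key.
    parent, cur = None, tree.root
    while cur is not None:
        parent = cur
        cur = cur.left if x.key < cur.key else cur.right
    x.parent = parent
    if parent is None:
        tree.root = x
    elif x.key < parent.key:
        parent.left = x
    else:
        parent.right = x

    x.color = "red"
    x.size = 1
    # Every ancestor gains exactly one node in its subtree.
    tmp = x.parent
    while tmp is not None:
        tmp.size += 1
        tmp = tmp.parent


tree = Tree()
nodes = {k: Node(k) for k in (5, 3, 8, 1, 4)}
for k in (5, 3, 8, 1, 4):
    augmented_insert(tree, nodes[k])
print(tree.root.size, nodes[3].size, nodes[8].size)  # 5 3 1
```

Counting the NIL children would only add a fixed offset to every size, so leaving them out keeps `size` equal to the number of keys in the subtree.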

IOI 2003 : how to calculate the node that has the minimum balance in a tree?

Here is the Balancing Act problem, which asks for the node that has the minimum balance in a tree. Balance is defined as:
Deleting any node from the tree yields a forest: a collection of one or more trees. Define the balance of a node to be the size of the largest tree in the forest created by deleting that node from T.
For a sample tree with these edges:
2 6
1 2
1 4
4 5
3 7
3 1
the explanation is:
Deleting node 4 yields two trees whose member nodes are {5} and {1,2,3,6,7}. The larger of these two trees has five nodes, thus the balance of node 4 is five. Deleting node 1 yields a forest of three trees of equal size: {2,6}, {3,7}, and {4,5}. Each of these trees has two nodes, so the balance of node 1 is two.
What kind of algorithm can you offer to this problem?
Thanks
I am going to assume that you have had a looong look at this problem: reading the solution does not help, you only get better at solving these problems by solving them yourself.
So one thing to observe is, the input is a tree. That means that each edge joins 2 smaller trees together. Removing an edge yields 2 disconnected trees (a forest of 2 trees).
So, if you calculate the size of the tree on one side of the edge, and then on the other, you should be able to look at a node's edges and ask "What is the size of the tree on the other side of this edge?"
You can calculate the sizes of trees using dynamic programming - your recurrence state is "What edge am I on? What side of the edge am I on?" and it calculates the size of the tree "hung" at that node. That is the crux of the problem.
Having that data, it is sufficient to iterate through all the nodes, look at their edges and ask "What is the size of the tree on the other side of this edge?" From there, you just pick the minimum.
Hope that helps.
You basically want to check 3 things for every node:
The size of its left subtree.
The size of its right subtree.
The size of the rest of the tree. (size of tree - left - right - 1, where the 1 is the node itself)
You can use this algorithm and expand it to any kind of tree (different number of subnodes).
Go over the tree depth-first, handling each node after its children (post-order).
Do this recursively:
Each time, just before you back up from a node to its "father" node, add 1 + the total size of the node's subtrees to the "father" node.
Then store a value, let's call it maxTree, in the node. It holds the maximum of all its subtree sizes and (size of tree) - 1 - (sum of all its subtrees), which is the part of the tree that lies above the node.
This way you can calculate all the subtree sizes in O(N).
While traversing the tree, you can keep a variable that holds the minimum value found so far.
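The subtree-size approach from the answers above can be sketched in Python as follows. The function name, the iterative traversal, and the tie-break (smallest node label) are choices made for this sketch, not something the problem text above specifies. The tree is rooted at an arbitrary node, and for each node the candidate pieces are its child subtrees plus the remainder n - size[v].

```python
from collections import defaultdict


def min_balance_node(n, edges):
    """Return (node, balances) where node has the smallest balance.
    Ties are broken by the smallest label (an assumption of this sketch)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    root = edges[0][0] if edges else 1
    parent = {root: None}
    order = []                  # preorder: parents before their descendants
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)

    size = {v: 1 for v in order}
    balance = {}
    # Reversed preorder visits every node after all of its descendants.
    for v in reversed(order):
        largest_child = max(
            (size[w] for w in adj[v] if w != parent[v]), default=0
        )
        balance[v] = max(largest_child, n - size[v])
        if parent[v] is not None:
            size[parent[v]] += size[v]

    best = min(balance, key=lambda v: (balance[v], v))
    return best, balance


sample_edges = [(2, 6), (1, 2), (1, 4), (4, 5), (3, 7), (3, 1)]
best, balance = min_balance_node(7, sample_edges)
print(best, balance[best], balance[4])  # 1 2 5
```

Each node and edge is handled a constant number of times, so the whole computation is O(N), and the printed balances for nodes 1 and 4 match the explanation of the sample.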

Resources