I was reading about Red Black Trees, and I understand that nodes can either be red or black. The rules say:
1) Every node has a color - either red or black.
2) Root of tree is always black.
3) There are no two adjacent red nodes (A red node cannot have a red parent or red child)
4) Every path from a node (including root) to any of its descendant NULL node has the same number of black nodes.
There is no description of which nodes should be coloured red. I understand that the tree can have all black nodes, but what I don't understand is when we should mark a node as red.
I would also like to understand the reasoning behind this colour coding. I know that an RB tree is basically a self-balancing tree, but how does this binary colour coding help with that?
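The usual way is to color every new node red on insert and then perform a fix-up. Inserting a red node may not produce any violation at all, whereas inserting a black node into a non-empty tree always would, because it adds a black node to some root-to-leaf paths but not to others.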
Basic insert algorithm:
    insert value as for any BST, coloring the node red if a new node is created, done if the value is already present
    LOOP:
        if node is tree anchor then set node's color to black and done
        if node's parent's color is black then done
        if node's uncle's color is red then flip colors of node's parent, node's uncle and node's grandparent, set node to node's grandparent and go to LOOP
        if node's parent and node have different orientations then rotate node's parent so that node's parent becomes child to node
        otherwise set node to node's parent
        flip colors of node and node's parent
        rotate at node's parent so that node's parent becomes child to node
        done
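The steps above can be sketched in Python roughly as follows. This is only one possible reading of the pseudocode: the names Node, RedBlackTree and _rotate_up are made up for this example, "tree anchor" is taken to mean the root, and a missing child (None) is treated as a black NIL leaf.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED                      # every new node starts out red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate_up(self, node):
        """Rotate at node's parent so that the parent becomes node's child."""
        parent, grand = node.parent, node.parent.parent
        if node is parent.left:               # right rotation
            parent.left = node.right
            if node.right is not None:
                node.right.parent = parent
            node.right = parent
        else:                                 # left rotation
            parent.right = node.left
            if node.left is not None:
                node.left.parent = parent
            node.left = parent
        parent.parent = node
        node.parent = grand
        if grand is None:
            self.root = node
        elif grand.left is parent:
            grand.left = node
        else:
            grand.right = node

    def insert(self, key):
        # Step 1: ordinary BST insert; stop if the key is already present.
        parent, cur = None, self.root
        while cur is not None:
            if key == cur.key:
                return
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node = Node(key)
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        # Step 2: the LOOP from the pseudocode.
        while True:
            parent = node.parent
            if parent is None:                # node is the root
                node.color = BLACK
                return
            if parent.color == BLACK:
                return
            grand = parent.parent             # exists: a red parent is never the root
            uncle = grand.right if parent is grand.left else grand.left
            if uncle is not None and uncle.color == RED:
                parent.color = uncle.color = BLACK
                grand.color = RED
                node = grand
                continue
            if (node is parent.left) != (parent is grand.left):
                self._rotate_up(node)         # different orientations: node moves up
            else:
                node = parent
            node.color, node.parent.color = BLACK, RED
            self._rotate_up(node)
            return


tree = RedBlackTree()
for key in (1, 2, 3):
    tree.insert(key)
print(tree.root.key, tree.root.color)         # 2 black
```

Inserting 1, 2, 3 in order triggers the rotation case on the third insert, so 2 ends up as the black root with red children 1 and 3.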
A new node is always inserted as a RED node. If the tree violates any of the red-black properties after the insertion, we repair it by recolouring or rotating nodes.
Properties of Red-Black Tree:
Every node is either red or black.
The root is black.
Every leaf (NIL) is black.
If a node is red, then both its children are black.
For each node, all simple paths from the node to descendant leaves contain the same number of black nodes.
According to the properties, are these valid or invalid red black trees?
A.
I think this is valid
B.
I think this is valid, but I am not sure since there are two adjacent red nodes?
C.
I think this is valid, but I am not sure since there are two adjacent red nodes?
D.
I think this is not valid since it violates Property 4?
Did I understand these properties of a RBtree right? If not, where am I wrong?
You have listed the properties of Red-Black trees correctly. Of the four trees only C is not a valid red-black tree:
A.
This is a valid tree. Wikipedia confirms:
every perfect binary tree that consists only of black nodes is a red–black tree.
B.
I think this is valid, but I am not sure since there are two adjacent red nodes?
It is valid. There is no problem with red nodes being siblings. They just should not be in a parent-child relationship.
C.
I think this is valid, but I am not sure since there are two adjacent red nodes?
It is not valid. Not because of the adjacent red nodes, but because of property 5. The node with label 12 has paths to its leaves with varying numbers of black nodes. The same goes for node 25.
As a general rule, a red node can never have exactly one NIL-leaf as child. Its children should either both be NIL-leaves, or both be (black) internal nodes. This follows from the properties.
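To try the rule out, here is a small checker sketch. The Node class, black_height and is_red_black are names invented for this example; None stands in for a black NIL leaf, and the BST key ordering is not checked, only the colour properties.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    color: str                          # "red" or "black"
    left: Optional["Node"] = None       # None plays the role of a black NIL leaf
    right: Optional["Node"] = None


def black_height(node):
    """Return the black height (counting the NIL leaf) or raise ValueError."""
    if node is None:
        return 1
    if node.color == "red":
        for child in (node.left, node.right):
            if child is not None and child.color == "red":
                raise ValueError(f"red node {node.key} has a red child")
    left_height = black_height(node.left)
    right_height = black_height(node.right)
    if left_height != right_height:
        raise ValueError(f"unequal black heights below node {node.key}")
    return left_height + (1 if node.color == "black" else 0)


def is_red_black(root):
    """Check the root, red-child and black-height properties."""
    if root is not None and root.color != "black":
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True


# Red siblings are fine: both red nodes have two NIL leaves as children.
red_siblings = Node(10, "black", Node(5, "red"), Node(15, "red"))

# A red node with exactly one NIL leaf as child breaks the black-height rule.
one_nil_child = Node(10, "black",
                     Node(5, "red", None, Node(7, "black")),
                     Node(15, "black"))

# Giving the red node two black internal children repairs it.
two_black_children = Node(10, "black",
                          Node(5, "red", Node(3, "black"), Node(7, "black")),
                          Node(15, "black"))

print(is_red_black(red_siblings))        # True
print(is_red_black(one_nil_child))       # False
print(is_red_black(two_black_children))  # True
```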
D.
I think this is not valid since it violates Property 4?
Property 4 is not violated: the children of the red nodes are NIL leaves (not visualised here), which are black. The fact that these red nodes have black NIL leaves as siblings is irrelevant: there are no rules that concern siblings. So this is valid.
For an example that combines characteristics of tree C and D, see this valid tree depicted in the Wikipedia article, which also depicts the NIL leaves:
A, B & D are valid red-black trees.
C is not a valid red-black tree, as the black height from root to leaf is not the same on every path. It is 2 on some paths and 1 on others. This violates what you stated as rule 5.
If 12 had a right child that was black and 25 a left child that was black, then it would be a red-black tree.
A red-black tree is basically equivalent to a 2-3-4 tree (a B-tree of order 4), even though the splitting/swapping method is upside down.
2-3-4 trees have fixed-size 3-node buckets. The color black means that the node is the central node of its 3-bucket. Any red-black tree can be viewed as a perfect quadtree/binary tree (of 3-node buckets) with empty nodes (black holes and red holes).
In other words, every black node (every 3-bucket) has an absolute position in the perfect tree (a unique 2-dimensional Cartesian position, or a unique 4-adic/2-adic fraction).
NIL nodes are just extra flags to save space; you don't have enough memory to store a perfect quadtree/binary tree.
The easiest way to check a red-black tree is to check that each black node starts a new bucket (going down) and each red node is grouped with the black node above it (same bucket). If the central black node has fewer than 2 red nodes, you can just add empty red holes next to the central black node (left and right).
A new black node is always the grandson of the last black node, and each black node can have only two red daughter-nodes and no black son-nodes. If the red daughter (mother) is empty (dead/unborn), the motherless grandson-node is directly linked to its grandfather-node.
A motherless black grandson-node has no brother, but he can have a black cousin-node next to him; the 2 cousins are linked to the same grandfather.
A quadtree is a subset of a binary tree.
All black nodes have even heights (2, 4, 6, ...), and all red nodes have odd heights (1, 3, 5, ...). Optionally, you can use the half unit 0.5.
The 3-bucket has a fixed size of 3; just add extra red holes (unborn, unlinked red daughters) to make the size 3.
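The bucket view can be made concrete with a short sketch that groups every black node with its red children. The Node class and the to_234 function are invented for this illustration, and the code assumes its input already is a valid red-black tree (it does not validate anything).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    color: str                          # "red" or "black"
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def to_234(black):
    """Turn the subtree under a black node into (keys, child_buckets).

    Each bucket holds the black node's key plus the keys of its red
    children, so it contains one, two or three keys.
    """
    if black is None:
        return None
    keys, subtrees = [], []
    left, right = black.left, black.right
    if left is not None and left.color == "red":
        keys.append(left.key)                   # red child joins the bucket
        subtrees += [left.left, left.right]
    else:
        subtrees.append(left)                   # black child starts a new bucket
    keys.append(black.key)
    if right is not None and right.color == "red":
        keys.append(right.key)
        subtrees += [right.left, right.right]
    else:
        subtrees.append(right)
    children = [to_234(sub) for sub in subtrees]
    return keys, [child for child in children if child is not None]


# Black 10 with a red left child 5 (which has black children 3 and 7)
# and a black right child 15.
root = Node(10, "black",
            Node(5, "red", Node(3, "black"), Node(7, "black")),
            Node(15, "black"))

print(to_234(root))
# ([5, 10], [([3], []), ([7], []), ([15], [])])
```

Here the root bucket holds the two keys 5 and 10 and has three child buckets, which is exactly a 3-node of a 2-3-4 tree.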
I'm working on data structures. There is something I do not understand about red-black trees. Every description lists the following properties for these trees, but in all the examples the leaves hold no value (they are null). Why is that not among the properties? Why is the rule not "All leaf nodes are black and empty"?
Red/Black Property: Every node is colored, either red or black.
Root Property: The root is black.
Leaf Property: Every leaf (NIL) is black.
Red Property: If a red node has children, then the children are always black.
Depth Property: For each node, any simple path from this node to any of its descendant leaf has the same black-depth (the number of black nodes).
It doesn't really matter whether all leaf nodes are keyless and black or not.
If the leaf nodes could be any color and/or empty, the asymptotics would not change at all, since red children cannot have red parents and all internal nodes have associated keys.
All paths would still differ in length by at most a factor of two in the total number of keys, though the bound would now be 2L+2, rather than 2L, for the longest path compared to the shortest path (of length L).
If there is such an edge, then, it means the white node is the parent node and gray node is discovered while scanning the adjacency list on white node, which in turn will make the white node, gray.
There can only be back edges from white to gray nodes, as far as I can tell. Can someone clarify where I am going wrong?
EDIT
WHITE node: Undiscovered node
GRAY node: Discovered, but not yet closed.
According to this explanation of red black tree, the tree must have the following properties:
A node is either red or black.
The root is black. (This rule is sometimes omitted. Since the root can always be changed from red to black, but not necessarily vice-versa, this rule has little effect on analysis.)
All leaves (NIL) are black. (All leaves are same color as the root.)
Both children of every red node are black.
Every simple path from a given node to any of its descendant leaves contains the same number of black nodes.
What is stopping someone from making every single node black?
It is possible. But to maintain condition 5, sometimes you have to color a node RED.
For example, consider the following tree.
  a
 / \
b   c
Here all nodes can be BLACK.
Now if you want to insert a new node, which color will you choose? RED, since if you choose BLACK, condition 5 will no longer be satisfied. So basically you can keep inserting RED nodes as long as none of the conditions (1-4) is broken.
The last rule you quoted is "Every simple path from a given node to any of its descendant leaves contains the same number of black nodes."
If all nodes are black, then the path from the root to any leaf must contain the same number of nodes. In other words, all leaves are at the same depth - so this is only possible for a perfect binary tree.
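A quick way to see this is to collect the depths at which the NIL leaves sit. In the sketch below a tree is written as nested (left, right) pairs with None for NIL; that representation and the function names are assumptions made only for this example.

```python
def nil_depths(tree, depth=0):
    """Return the set of depths at which NIL leaves occur.

    A tree is either None (a NIL leaf) or a (left, right) pair.
    """
    if tree is None:
        return {depth}
    left, right = tree
    return nil_depths(left, depth + 1) | nil_depths(right, depth + 1)


def valid_when_all_black(tree):
    """With every node black, the black count on a path equals its length,
    so the last property holds only if all NIL leaves share one depth."""
    return len(nil_depths(tree)) == 1


leaf = (None, None)                 # a node whose children are both NIL
perfect = (leaf, leaf)              # a root with two children
lopsided = ((leaf, None), leaf)     # the same tree after one child gains a child

print(valid_when_all_black(perfect))    # True
print(valid_when_all_black(lopsided))   # False
```

The lopsided tree has NIL leaves at two different depths, so the extra node cannot be black; it has to be coloured red instead.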
I am looking at the code of inserting into an augmented Red black tree. This tree has an additional field called "size" and it keeps the size of a subtree rooted at a node x. Here is the pseudocode for inserting a new node:
AugmentedRBT_Insert(T,x){
    BST_Insert(T,x);              //insert as if it is a normal BST
    color[x]=red;                 //insert as a red node
    size[x]=1;
    tmp=parent[x];
    while(tmp!=NULL){             //start from the node x and follow the path to root
        size[tmp]=size[tmp]+1;    //update the size of each node
        tmp=parent[tmp];
    }
}
Forget about fixing the coloring and the rotations; they are done in another function. My question is: why do we set the size of the newly added node "x" to 1? I understand that it has no subtrees, so its size should be 1. But one of the requirements of an RBT is that every red node has two black children, and in fact every leaf is NULL. Even if we inserted "x" as black, it would still have two black NULL children, so I think we should set its size to 3. Am I wrong?
Thanks.
An insertion in a red-black tree, as in most binary trees, happens directly at a leaf position. Hence the size of the subtree rooted at the new node is 1. The red node does have two black children, because a leaf always has the "root" or "nil" sentinel as its child, which is black. Those null elements aren't real nodes, so we don't count them.
Then, we go and adjust the sizes of all parents up to the root (they each get +1 for the node we just added).
Finally, we fix these values when we rotate the tree to balance it, if necessary. In your implementation, you will probably want to do both the size updates and rotations in one pass instead of two.
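As a rough sketch of how the size bookkeeping could look (colours and the fix-up are left out on purpose): Node, size, insert and rotate_left are illustrative names, NIL children are represented by None and count as size 0, and insert assumes the key is not already in the tree, because it increments sizes on the way down rather than in a second walk up to the root.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.size = 1                 # the node itself; NIL children count as 0
        self.left = self.right = self.parent = None


def size(node):
    return node.size if node is not None else 0


def insert(root, key):
    """BST insert that adds 1 to every node on the search path.

    Assumes key is not already present. Returns the (possibly new) root.
    """
    new = Node(key)
    parent, cur = None, root
    while cur is not None:
        cur.size += 1                 # the new node will end up below cur
        parent, cur = cur, (cur.left if key < cur.key else cur.right)
    new.parent = parent
    if parent is None:
        return new
    if key < parent.key:
        parent.left = new
    else:
        parent.right = new
    return root


def rotate_left(root, x):
    """Left-rotate at x and repair the two sizes that change."""
    y = x.right
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    y.size = x.size                   # y now roots the subtree x used to root
    x.size = size(x.left) + size(x.right) + 1
    return root


root = None
for key in (1, 2, 3):
    root = insert(root, key)
print(root.key, root.size)            # 1 3
root = rotate_left(root, root)
print(root.key, root.size)            # 2 3
```

Only the two nodes involved in a rotation need their size recomputed; every other subtree keeps the same set of nodes.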